Principles of Biochemistry 

Fifth Edition 

Laurence A. Moran 

University of Toronto 

H. Robert Horton 

North Carolina State University 

K. Gray Scrimgeour 

University of Toronto 

Marc D. Perry 

University of Toronto 


Boston Columbus Indianapolis New York San Francisco Upper Saddle River 
Amsterdam Cape Town Dubai London Madrid Milan Munich Paris Montreal Toronto 
Delhi Mexico City Sao Paulo Sydney Hong Kong Seoul Singapore Taipei Tokyo 

Editor in Chief: Adam Jaworski 
Executive Editor: Jeanne Zalesky 
Marketing Manager: Erin Gardner 
Project Editor: Jennifer Hart 
Associate Editor: Jessica Neumann 
Editorial Assistant: Lisa Tarabokjia 
Marketing Assistant: Nicola Houston 

Vice President, Executive Director of Development: Carol Truehart 
Developmental Editor: Michael Sypes 

Managing Editor, Chemistry and Geosciences: Gina M. Cheselka 
Project Manager, Science: Wendy Perez 
Senior Technical Art Specialist: Connie Long 
Art Studios: Mark Landis Illustrations 
/Jonathan Parrish 
/2064 Design — Greg Gambino 
Image Resource Manager: Maya Melenchuk 
Photo Researcher: Eric Schrader 
Art Manager: Marilyn Perry 
Interior/Cover Designer: Tamara Newnam 
Media Project Manager: Shannon Kong 
Senior Manufacturing and Operations Manager: Nick Sklitsis 
Operations Specialist: Maura Zaldivar 
Composition/Full Service: Nesbitt Graphics, Inc. 

Cover Illustration: Quade Paul, Echo Medical Media 

Cover Image Credit: Monkey adapted from Simone van den Berg/Shutterstock 

Credits and acknowledgments borrowed from other sources and reproduced, with permission, in this textbook 
appear on page 767. 

Copyright © 2012, 2006, 2002, 1996 Pearson Education, Inc. All rights reserved. Manufactured in the United 
States of America. This publication is protected by Copyright and permission should be obtained from the 
publisher prior to any prohibited reproduction, storage in a retrieval system, or transmission in any form or by 
any means, electronic, mechanical, photocopying, recording, or likewise. To obtain permission(s) to use material 
from this work, please submit a written request to Pearson Education, Inc., Permissions Department, 
1900 E. Lake Ave., Glenview, IL 60025. For information regarding permissions, call (847) 486-2635. 

Many of the designations used by manufacturers and sellers to distinguish their products are claimed as 
trademarks. Where those designations appear in this book, and the publisher was aware of a trademark claim, the 
designations have been printed in initial caps or all caps. 

Library of Congress Cataloging-in-Publication Data 

Principles of biochemistry / H. Robert Horton ... [et al.]. -- 5th ed. 
p. cm. 

ISBN 0-321-70733-8 

1. Biochemistry. I. Horton, H. Robert, 1935- 
QP514.2.P745 2012 
612'.015--dc23 


ISBN 10: 0-321-70733-8 

ISBN 13: 978-0-321-70733-8 
123456789 10— DOW— 16 15 14 13 12 


Science should be as simple as possible, 
but not simpler. 

- Albert Einstein 

Brief Contents 

Part One 


1 Introduction to Biochemistry 1 

2 Water 28 

Part Two 

Structure and Function 

3 Amino Acids and the Primary Structures of Proteins 55 

4 Proteins: Three-Dimensional Structure and Function 85 

5 Properties of Enzymes 134 

6 Mechanisms of Enzymes 162 

7 Coenzymes and Vitamins 196 

8 Carbohydrates 227 

9 Lipids and Membranes 256 

Part Three 

Metabolism and Bioenergetics 

10 Introduction to Metabolism 294 

11 Glycolysis 325 

12 Gluconeogenesis, the Pentose Phosphate 
Pathway, and Glycogen Metabolism 355 

13 The Citric Acid Cycle 385 

14 Electron Transport and ATP Synthesis 417 

15 Photosynthesis 443 

16 Lipid Metabolism 475 

17 Amino Acid Metabolism 514 

18 Nucleotide Metabolism 550 

Part Four 

Biological Information Flow 

19 Nucleic Acids 573 

20 DNA Replication, Repair, and Recombination 601 

21 Transcription and RNA Processing 634 

22 Protein Synthesis 666 



To the Student xxiii 

Preface xxv 

About the Authors xxxiii 

Part One 


1 Introduction to Biochemistry 1 

1.1 Biochemistry Is a Modern Science 2 

1.2 The Chemical Elements of Life 3 

1.3 Many Important Macromolecules Are Polymers 4 

A. Proteins 6 

B. Polysaccharides 6 

C. Nucleic Acids 7 

D. Lipids and Membranes 9 

1.4 The Energetics of Life 10 

A. Reaction Rates and Equilibria 11 

B. Thermodynamics 12 

C. Equilibrium Constants and Standard Gibbs Free Energy Changes 13 

D. Gibbs Free Energy and Reaction Rates 14 

1.5 Biochemistry and Evolution 15 

1.6 The Cell Is the Basic Unit of Life 17 

1.7 Prokaryotic Cells: Structural Features 17 

1.8 Eukaryotic Cells: Structural Features 18 

A. The Nucleus 20 

B. The Endoplasmic Reticulum and Golgi Apparatus 20 

C. Mitochondria and Chloroplasts 21 

D. Specialized Vesicles 22 

E. The Cytoskeleton 23 

1.9 A Picture of the Living Cell 23 

1.10 Biochemistry Is Multidisciplinary 26 

Appendix: The Special Terminology of Biochemistry 26 
Selected Readings 27 

2 Water 28 

2.1 The Water Molecule Is Polar 29 

2.2 Hydrogen Bonding in Water 30 
Box 2.1 Extreme Thermophiles 32 

2.3 Water Is an Excellent Solvent 32 

A. Ionic and Polar Substances Dissolve in Water 32 
Box 2.2 Blood Plasma and Seawater 33 

B. Cellular Concentrations and Diffusion 34 

C. Osmotic Pressure 34 

2.4 Nonpolar Substances Are Insoluble in Water 35 


2.5 Noncovalent Interactions 37 

A. Charge-Charge Interactions 37 

B. Hydrogen Bonds 37 

C. Van der Waals Forces 38 

D. Hydrophobic Interactions 39 

2.6 Water Is Nucleophilic 39 

Box 2.3 The Concentration of Water 41 

2.7 Ionization of Water 41 

2.8 The pH Scale 43 

Box 2.4 The Little “p” in pH 44 

2.9 Acid Dissociation Constants of Weak Acids 44 

Sample Calculation 2.1 Calculating the pH of Weak Acid Solutions 49 

2.10 Buffered Solutions Resist Changes in pH 50 
Sample Calculation 2.2 Buffer Preparation 50 
Summary 52 

Problems 52 
Selected Readings 54 


Part Two 

Structure and Function 

3 Amino Acids and the Primary Structures of Proteins 55 

3.1 General Structure of Amino Acids 56 

3.2 Structures of the 20 Common Amino Acids 58 

Box 3.1 Fossil Dating by Amino Acid Racemization 58 

A. Aliphatic R Groups 59 

B. Aromatic R Groups 59 

C. R Groups Containing Sulfur 60 

D. Side Chains with Alcohol Groups 60 
Box 3.2 An Alternative Nomenclature 61 

E. Positively Charged R Groups 61 

F. Negatively Charged R Groups and Their Amide Derivatives 62 

G. The Hydrophobicity of Amino Acid Side Chains 62 

3.3 Other Amino Acids and Amino Acid Derivatives 62 

3.4 Ionization of Amino Acids 63 

Box 3.3 Common Names of Amino Acids 64 

3.5 Peptide Bonds Link Amino Acids in Proteins 67 

3.6 Protein Purification Techniques 68 

3.7 Analytical Techniques 70 

3.8 Amino Acid Composition of Proteins 73 

3.9 Determining the Sequence of Amino Acid Residues 74 

3.10 Protein Sequencing Strategies 76 

3.11 Comparisons of the Primary Structures of 
Proteins Reveal Evolutionary Relationships 79 
Summary 82 

Problems 82 
Selected Readings 84 

4 Proteins: Three-Dimensional Structure and Function 85 

4.1 There Are Four Levels of Protein Structure 87 

4.2 Methods for Determining Protein Structure 88 


4.3 The Conformation of the Peptide Group 91 

Box 4.1 Flowering Is Controlled by Cis/Trans Switches 93 

4.4 The α Helix 94 

4.5 β Strands and β Sheets 97 

4.6 Loops and Turns 98 

4.7 Tertiary Structure of Proteins 99 

A. Supersecondary Structures 100 

B. Domains 101 

C. Domain Structure, Function, and Evolution 102 

D. Intrinsically Disordered Proteins 102 

4.8 Quaternary Structure 103 

4.9 Protein-Protein Interactions 109 

4.10 Protein Denaturation and Renaturation 110 

4.11 Protein Folding and Stability 114 

A. The Hydrophobic Effect 114 

B. Hydrogen Bonding 115 

Box 4.2 CASP: The Protein Folding Game 116 

C. Van der Waals Interactions and Charge-Charge Interactions 117 

D. Protein Folding Is Assisted by Molecular Chaperones 117 

4.12 Collagen, a Fibrous Protein 119 
Box 4.3 Stronger Than Steel 121 

4.13 Structure of Myoglobin and Hemoglobin 122 

4.14 Oxygen Binding to Myoglobin and Hemoglobin 123 

A. Oxygen Binds Reversibly to Heme 123 

B. Oxygen-Binding Curves of Myoglobin and Hemoglobin 124 
Box 4.4 Embryonic and Fetal Hemoglobins 126 

C. Hemoglobin Is an Allosteric Protein 127 

4.15 Antibodies Bind Specific Antigens 129 
Summary 130 

Problems 131 
Selected Readings 133 

5 Properties of Enzymes 134 

5.1 The Six Classes of Enzymes 136 

Box 5.1 Enzyme Classification Numbers 137 

5.2 Kinetic Experiments Reveal Enzyme Properties 138 

A. Chemical Kinetics 138 

B. Enzyme Kinetics 139 

5.3 The Michaelis-Menten Equation 140 

A. Derivation of the Michaelis-Menten Equation 141 

B. The Catalytic Constant kcat 143 

C. The Meanings of Km 144 

5.4 Kinetic Constants Indicate Enzyme Activity and Catalytic Proficiency 

5.5 Measurement of Km and Vmax 145 

Box 5.2 Hyperbolas Versus Straight Lines 146 

5.6 Kinetics of Multisubstrate Reactions 147 

5.7 Reversible Enzyme Inhibition 148 

A. Competitive Inhibition 149 

B. Uncompetitive Inhibition 150 



C. Noncompetitive Inhibition 150 

D. Uses of Enzyme Inhibition 151 

5.8 Irreversible Enzyme Inhibition 152 

5.9 Regulation of Enzyme Activity 153 

A. Phosphofructokinase Is an Allosteric Enzyme 154 

B. General Properties of Allosteric Enzymes 155 

C. Two Theories of Allosteric Regulation 156 

D. Regulation by Covalent Modification 158 

5.10 Multienzyme Complexes and Multifunctional Enzymes 158 
Summary 159 

Problems 159 
Selected Readings 161 

6 Mechanisms of Enzymes 162 

6.1 The Terminology of Mechanistic Chemistry 162 

A. Nucleophilic Substitutions 163 

B. Cleavage Reactions 163 

C. Oxidation-Reduction Reactions 164 

6.2 Catalysts Stabilize Transition States 164 

6.3 Chemical Modes of Enzymatic Catalysis 166 

A. Polar Amino Acid Residues in Active Sites 166 

Box 6.1 Site-Directed Mutagenesis Modifies Enzymes 167 

B. Acid-Base Catalysis 168 

C. Covalent Catalysis 169 

D. pH Affects Enzymatic Rates 170 

6.4 Diffusion-Controlled Reactions 171 

A. Triose Phosphate Isomerase 172 

Box 6.2 The “Perfect Enzyme”? 174 

B. Superoxide Dismutase 175 

6.5 Modes of Enzymatic Catalysis 175 

A. The Proximity Effect 176 

B. Weak Binding of Substrates to Enzymes 178 

C. Induced Fit 179 

D. Transition State Stabilization 180 

6.6 Serine Proteases 183 

A. Zymogens Are Inactive Enzyme Precursors 183 
Box 6.3 Kornberg’s Ten Commandments 183 

B. Substrate Specificity of Serine Proteases 184 

C. Serine Proteases Use Both the Chemical 
and the Binding Modes of Catalysis 185 

Box 6.4 Clean Clothes 186 
Box 6.5 Convergent Evolution 187 

6.7 Lysozyme 187 

6.8 Arginine Kinase 190 
Summary 192 
Problems 193 
Selected Readings 194 

7 Coenzymes and Vitamins 196 

7.1 Many Enzymes Require Inorganic Cations 197 

7.2 Coenzyme Classification 197 

7.3 ATP and Other Nucleotide Cosubstrates 198 
Box 7.1 Missing Vitamins 200 

7.4 NAD+ and NADP+ 200 

Box 7.2 NAD Binding to Dehydrogenases 203 

7.5 FAD and FMN 204 

7.6 Coenzyme A and Acyl Carrier Protein 204 

7.7 Thiamine Diphosphate 206 

7.8 Pyridoxal Phosphate 207 

7.9 Vitamin C 209 

7.10 Biotin 211 

Box 7.3 One Gene: One Enzyme 212 

7.11 Tetrahydrofolate 213 

7.12 Cobalamin 215 

7.13 Lipoamide 216 

7.14 Lipid Vitamins 217 

A. Vitamin A 217 

B. Vitamin D 218 

C. Vitamin E 218 

D. Vitamin K 218 

7.15 Ubiquinone 219 

Box 7.4 Rat Poison 220 

7.16 Protein Coenzymes 221 

7.17 Cytochromes 221 

Box 7.5 Nobel Prizes for Vitamins and Coenzymes 223 
Summary 223 
Problems 224 
Selected Readings 226 

8 Carbohydrates 227 

8.1 Most Monosaccharides Are Chiral Compounds 228 

8.2 Cyclization of Aldoses and Ketoses 230 

8.3 Conformations of Monosaccharides 234 

8.4 Derivatives of Monosaccharides 235 

A. Sugar Phosphates 235 

B. Deoxy Sugars 235 

C. Amino Sugars 235 

D. Sugar Alcohols 236 

E. Sugar Acids 236 

8.5 Disaccharides and Other Glycosides 236 

A. Structures of Disaccharides 237 

B. Reducing and Nonreducing Sugars 238 

C. Nucleosides and Other Glycosides 239 
Box 8.1 The Problem with Cats 240 

8.6 Polysaccharides 240 

A. Starch and Glycogen 240 

B. Cellulose 243 


C. Chitin 244 

8.7 Glycoconjugates 244 

A. Proteoglycans 244 

Box 8.2 Nodulation Factors Are Lipo-Oligosaccharides 246 

B. Peptidoglycans 246 

C. Glycoproteins 248 

Box 8.3 ABO Blood Group 250 
Summary 252 
Problems 253 
Selected Readings 254 

9 Lipids and Membranes 256 

9.1 Structural and Functional Diversity of Lipids 256 

9.2 Fatty Acids 256 

Box 9.1 Common Names of Fatty Acids 258 
Box 9.2 Trans Fatty Acids and Margarine 259 

9.3 Triacylglycerols 261 

9.4 Glycerophospholipids 262 

9.5 Sphingolipids 263 

9.6 Steroids 266 

9.7 Other Biologically Important Lipids 268 

9.8 Biological Membranes 269 

A. Lipid Bilayers 269 

Box 9.3 Gregor Mendel and Gibberellins 270 

B. Three Classes of Membrane Proteins 270 
Box 9.4 New Lipid Vesicles, or Liposomes 272 

Box 9.5 Some Species Have Unusual Lipids in Their Membranes 274 

C. The Fluid Mosaic Model of Biological Membranes 274 

9.9 Membranes Are Dynamic Structures 275 

9.10 Membrane Transport 277 

A. Thermodynamics of Membrane Transport 278 

B. Pores and Channels 279 

C. Passive Transport and Facilitated Diffusion 280 

D. Active Transport 282 

E. Endocytosis and Exocytosis 283 

9.11 Transduction of Extracellular Signals 283 

A. Receptors 283 

Box 9.6 The Hot Spice of Chili Peppers 284 

B. Signal Transducers 285 

C. The Adenylyl Cyclase Signaling Pathway 287 

D. The Inositol-Phospholipid Signaling Pathway 287 
Box 9.7 Bacterial Toxins and G Proteins 290 

E. Receptor Tyrosine Kinases 290 
Summary 291 

Problems 292 
Selected Readings 293 



Part Three 

Metabolism and Bioenergetics 

10 Introduction to Metabolism 294 

10.1 Metabolism Is a Network of Reactions 294 

10.2 Metabolic Pathways 297 

A. Pathways Are Sequences of Reactions 297 

B. Metabolism Proceeds by Discrete Steps 297 

C. Metabolic Pathways Are Regulated 297 

D. Evolution of Metabolic Pathways 301 

10.3 Major Pathways in Cells 302 

10.4 Compartmentation and Interorgan Metabolism 304 

10.5 Actual Gibbs Free Energy Change, Not Standard Free Energy Change, 
Determines the Direction of Metabolic Reactions 306 

Sample Calculation 10.1 Calculating Standard Gibbs Free Energy 
Change from Energies of Formation 308 

10.6 The Free Energy of ATP Hydrolysis 308 

10.7 The Metabolic Roles of ATP 311 

A. Phosphoryl Group Transfer 311 

Sample Calculation 10.2 Gibbs Free Energy Change 312 
Box 10.1 The Squiggle 312 

B. Production of ATP by Phosphoryl Group Transfer 314 

C. Nucleotidyl Group Transfer 315 

10.8 Thioesters Have High Free Energies of Hydrolysis 316 

10.9 Reduced Coenzymes Conserve Energy from Biological Oxidations 316 

A. Gibbs Free Energy Change Is Related to Reduction Potential 317 

B. Electron Transfer from NADH Provides Free Energy 319 

Box 10.2 NAD+ and NADH Differ in Their Ultraviolet Absorption Spectra 

10.10 Experimental Methods for Studying Metabolism 321 
Summary 322 

Problems 323 
Selected Readings 324 

11 Glycolysis 325 

11.1 The Enzymatic Reactions of Glycolysis 326 

11.2 The Ten Steps of Glycolysis 326 

1. Hexokinase 326 

2. Glucose 6-Phosphate Isomerase 327 

3. Phosphofructokinase-1 330 

4. Aldolase 330 

Box 11.1 A Brief History of the Glycolysis Pathway 331 

5. Triose Phosphate Isomerase 332 

6. Glyceraldehyde 3-Phosphate Dehydrogenase 333 

7. Phosphoglycerate Kinase 335 

Box 11.2 Formation of 2,3-Bisphosphoglycerate in Red Blood Cells 335 
Box 11.3 Arsenate Poisoning 336 

8. Phosphoglycerate Mutase 336 

9. Enolase 338 

10. Pyruvate Kinase 338 

11.3 The Fate of Pyruvate 338 

A. Metabolism of Pyruvate to Ethanol 339 

B. Reduction of Pyruvate to Lactate 340 

Box 11.4 The Lactate of the Long-Distance Runner 341 

11.4 Free Energy Changes in Glycolysis 341 

11.5 Regulation of Glycolysis 343 

A. Regulation of Hexose Transporters 344 

B. Regulation of Hexokinase 344 

Box 11.5 Glucose 6-Phosphate Has a Pivotal Metabolic Role in the Liver 345 

C. Regulation of Phosphofructokinase-1 345 

D. Regulation of Pyruvate Kinase 346 

E. The Pasteur Effect 347 

11.6 Other Sugars Can Enter Glycolysis 347 

A. Sucrose Is Cleaved to Monosaccharides 348 

B. Fructose Is Converted to Glyceraldehyde 3-Phosphate 348 

C. Galactose Is Converted to Glucose 1-Phosphate 349 
Box 11.6 A Secret Ingredient 349 

D. Mannose Is Converted to Fructose 6-Phosphate 351 

11.7 The Entner-Doudoroff Pathway in Bacteria 351 
Summary 352 

Problems 353 
Selected Readings 354 

12 Gluconeogenesis, the Pentose Phosphate Pathway, 
and Glycogen Metabolism 355 

12.1 Gluconeogenesis 356 

A. Pyruvate Carboxylase 357 

B. Phosphoenolpyruvate Carboxykinase 358 

C. Fructose 1,6-bisphosphatase 358 
Box 12.1 Supermouse 359 

D. Glucose 6-Phosphatase 359 

12.2 Precursors for Gluconeogenesis 360 

A. Lactate 360 

B. Amino Acids 360 

C. Glycerol 361 

D. Propionate and Lactate 361 

E. Acetate 362 

Box 12.2 Glucose Is Sometimes Converted to Sorbitol 362 

12.3 Regulation of Gluconeogenesis 363 

Box 12.3 The Evolution of a Complex Enzyme 364 

12.4 The Pentose Phosphate Pathway 364 

A. Oxidative Stage 366 

B. Nonoxidative Stage 364 

Box 12.4 Glucose 6-Phosphate Dehydrogenase Deficiency in Humans 367 

C. Interconversions Catalyzed by Transketolase and Transaldolase 368 

12.5 Glycogen Metabolism 368 

A. Glycogen Synthesis 369 

B. Glycogen Degradation 370 

12.6 Regulation of Glycogen Metabolism in Mammals 372 

A. Regulation of Glycogen Phosphorylase 372 
Box 12.5 Head Growth and Tail Growth 373 

B. Hormones Regulate Glycogen Metabolism 375 

C. Hormones Regulate Gluconeogenesis and Glycolysis 376 

12.7 Maintenance of Glucose Levels in Mammals 378 

12.8 Glycogen Storage Diseases 381 
Summary 382 

Problems 382 
Selected Readings 383 

13 The Citric Acid Cycle 385 

Box 13.1 An Egregious Error 386 

13.1 Conversion of Pyruvate to Acetyl CoA 387 

Sample Calculation 13.1 390 

13.2 The Citric Acid Cycle Oxidizes Acetyl CoA 391 
Box 13.2 Where Do the Electrons Come From? 392 

13.3 The Citric Acid Cycle Enzymes 394 

1. Citrate Synthase 394 

Box 13.3 Citric Acid 396 

2. Aconitase 396 

Box 13.4 Three-Point Attachment of Prochiral Substrates to Enzymes 

3. Isocitrate Dehydrogenase 397 

4. The α-Ketoglutarate Dehydrogenase Complex 398 

5. Succinyl CoA Synthetase 398 

6. Succinate Dehydrogenase Complex 399 
Box 13.5 What’s in a Name? 399 

Box 13.6 On the Accuracy of the World Wide Web 401 

7. Fumarase 401 

8. Malate Dehydrogenase 401 

Box 13.7 Converting One Enzyme into Another 402 

13.4 Entry of Pyruvate Into Mitochondria 402 

13.5 Reduced Coenzymes Can Fuel the Production of ATP 405 

13.6 Regulation of the Citric Acid Cycle 406 

13.7 The Citric Acid Cycle Isn’t Always a “Cycle” 407 
Box 13.8 A Cheap Cancer Drug? 408 

13.8 The Glyoxylate Pathway 409 

13.9 Evolution of the Citric Acid Cycle 412 
Summary 414 

Problems 414 
Selected Readings 416 

14 Electron Transport and ATP Synthesis 417 

14.1 Overview of Membrane-associated Electron Transport 
and ATP Synthesis 418 

14.2 The Mitochondrion 418 

Box 14.1 An Exception to Every Rule 420 

14.3 The Chemiosmotic Theory and the Protonmotive Force 420 

A. Historical Background: The Chemiosmotic Theory 420 

B. The Protonmotive Force 421 




14.4 Electron Transport 423 

A. Complexes I Through IV 423 

B. Cofactors in Electron Transport 425 

14.5 Complex I 426 

14.6 Complex II 427 

14.7 Complex III 428 

14.8 Complex IV 431 

14.9 Complex V: ATP Synthase 433 

Box 14.2 Proton Leaks and Heat Production 435 

14.10 Active Transport of ATP, ADP, and Pi Across 
the Mitochondrial Membrane 435 

14.11 The P/O Ratio 436 

14.12 NADH Shuttle Mechanisms in Eukaryotes 436 
Box 14.3 The High Cost of Living 439 

14.13 Other Terminal Electron Acceptors and Donors 439 

14.14 Superoxide Anions 440 
Summary 441 
Problems 441 
Selected Readings 442 

15 Photosynthesis 443 

15.1 Light-Gathering Pigments 444 

A. The Structures of Chlorophylls 444 

B. Light Energy 445 

C. The Special Pair and Antenna Chlorophylls 446 
Box 15.1 Mendel’s Seed Color Mutant 447 

D. Accessory Pigments 447 

15.2 Bacterial Photosystems 448 

A. Photosystem II 448 

B. Photosystem I 450 

C. Coupled Photosystems and Cytochrome bf 453 

D. Reduction Potentials and Gibbs Free Energy in Photosynthesis 455 

E. Photosynthesis Takes Place Within Internal Membranes 457 
Box 15.2 Oxygen “Pollution” of Earth’s Atmosphere 457 

15.3 Plant Photosynthesis 458 

A. Chloroplasts 458 

B. Plant Photosystems 459 

C. Organization of Chloroplast Photosystems 459 
Box 15.3 Bacteriorhodopsin 461 

15.4 Fixation of CO2: The Calvin Cycle 461 

A. The Calvin Cycle 462 

B. Rubisco: Ribulose 1,5-bisphosphate Carboxylase-oxygenase 462 

C. Oxygenation of Ribulose 1,5-bisphosphate 465 
Box 15.4 Building a Better Rubisco 466 

D. Calvin Cycle: Reduction and Regeneration Stages 466 

15.5 Sucrose and Starch Metabolism in Plants 467 
Box 15.5 Gregor Mendel’s Wrinkled Peas 469 

15.6 Additional Carbon Fixation Pathways 469 
A. Compartmentalization in Bacteria 469 


B. The C4 Pathway 469 

C. Crassulacean Acid Metabolism (CAM) 471 
Summary 472 

Problems 473 
Selected Readings 474 

16 Lipid Metabolism 475 

16.1 Fatty Acid Synthesis 475 

A. Synthesis of Malonyl ACP and Acetyl ACP 476 

B. The Initiation Reaction of Fatty Acid Synthesis 477 

C. The Elongation Reactions of Fatty Acid Synthesis 477 

D. Activation of Fatty Acids 479 

E. Fatty Acid Extension and Desaturation 479 

16.2 Synthesis of Triacylglycerols and Glycerophospholipids 481 

16.3 Synthesis of Eicosanoids 483 

Box 16.1 sn-Glycerol 3-Phosphate 484 

Box 16.2 The Search for a Replacement for Aspirin 486 

16.4 Synthesis of Ether Lipids 487 

16.5 Synthesis of Sphingolipids 488 

16.6 Synthesis of Cholesterol 488 

A. Stage 1: Acetyl CoA to Isopentenyl Diphosphate 488 

B. Stage 2: Isopentenyl Diphosphate to Squalene 488 

C. Stage 3: Squalene to Cholesterol 490 

D. Other Products of Isoprenoid Metabolism 490 
Box 16.3 Lysosomal Storage Diseases 492 

Box 16.4 Regulating Cholesterol Levels 493 

16.7 Fatty Acid Oxidation 494 

A. Activation of Fatty Acids 494 

B. The Reactions of β-Oxidation 494 

C. Fatty Acid Synthesis and β-Oxidation 497 

D. Transport of Fatty Acyl CoA into Mitochondria 497 
Box 16.5 A Trifunctional Enzyme for β-Oxidation 498 

E. ATP Generation from Fatty Acid Oxidation 498 

F. β-Oxidation of Odd-Chain and Unsaturated Fatty Acids 499 

16.8 Eukaryotic Lipids Are Made at a Variety of Sites 501 

16.9 Lipid Metabolism Is Regulated by Hormones in Mammals 502 

16.10 Absorption and Mobilization of Fuel Lipids in Mammals 505 

A. Absorption of Dietary Lipids 505 

B. Lipoproteins 505 

Box 16.6 Extra Virgin Olive Oil 506 

Box 16.7 Lipoprotein Lipase and Coronary Heart Disease 507 

C. Serum Albumin 508 

16.11 Ketone Bodies Are Fuel Molecules 508 

A. Ketone Bodies Are Synthesized in the Liver 509 

B. Ketone Bodies Are Oxidized in Mitochondria 510 
Box 16.8 Lipid Metabolism in Diabetes 511 
Summary 511 

Problems 511 
Selected Readings 513 


17 Amino Acid Metabolism 514 

17.1 The Nitrogen Cycle and Nitrogen Fixation 515 

17.2 Assimilation of Ammonia 518 

A. Ammonia Is Incorporated into Glutamate and Glutamine 518 

B. Transamination Reactions 518 

17.3 Synthesis of Amino Acids 520 

A. Aspartate and Asparagine 520 

B. Lysine, Methionine, Threonine 520 

C. Alanine, Valine, Leucine, and Isoleucine 521 

Box 17.1 Childhood Acute Lymphoblastic Leukemia Can Be Treated 
with Asparaginase 522 

D. Glutamate, Glutamine, Arginine, and Proline 523 

E. Serine, Glycine, and Cysteine 523 

F. Phenylalanine, Tyrosine, and Tryptophan 523 

G. Histidine 527 

Box 17.2 Genetically Modified Food 528 

Box 17.3 Essential and Nonessential Amino Acids in Animals 529 

17.4 Amino Acids as Metabolic Precursors 529 

A. Products Derived from Glutamate, Glutamine, and Aspartate 529 

B. Products Derived from Serine and Glycine 529 

C. Synthesis of Nitric Oxide from Arginine 530 

D. Synthesis of Lignin from Phenylalanine 531 

E. Melanin Is Made from Tyrosine 531 

17.5 Protein Turnover 531 

Box 17.4 Apoptosis-Programmed Cell Death 534 

17.6 Amino Acid Catabolism 534 

A. Alanine, Asparagine, Aspartate, Glutamate, and Glutamine 535 

B. Arginine, Histidine, and Proline 535 

C. Glycine and Serine 536 

D. Threonine 537 

E. The Branched Chain Amino Acids 537 

F. Methionine 539 

Box 17.5 Phenylketonuria, a Defect in Tyrosine Formation 540 

G. Cysteine 540 

H. Phenylalanine, Tryptophan, and Tyrosine 541 

I. Lysine 542 

17.7 The Urea Cycle Converts Ammonia into Urea 542 

A. Synthesis of Carbamoyl Phosphate 543 

B. The Reactions of the Urea Cycle 543 

Box 17.6 Diseases of Amino Acid Metabolism 544 

C. Ancillary Reactions of the Urea Cycle 547 

17.8 Renal Glutamine Metabolism Produces Bicarbonate 547 
Summary 548 

Problems 548 
Selected Readings 549 

18 Nucleotide Metabolism 550 

18.1 Synthesis of Purine Nucleotides 550 

Box 18.1 Common Names of the Bases 552 

18.2 Other Purine Nucleotides Are Synthesized from IMP 554 

18.3 Synthesis of Pyrimidine Nucleotides 555 


A. The Pathway for Pyrimidine Synthesis 556 

Box 18.2 How Some Enzymes Transfer Ammonia from Glutamate 558 

B. Regulation of Pyrimidine Synthesis 559 

18.4 CTP Is Synthesized from UMP 559 

18.5 Reduction of Ribonucleotides to Deoxyribonucleotides 560 

18.6 Methylation of dUMP Produces dTMP 560 

Box 18.3 Free Radicals in the Reduction of Ribonucleotides 562 
Box 18.4 Cancer Drugs Inhibit dTTP Synthesis 564 

18.7 Modified Nucleotides 564 

18.8 Salvage of Purines and Pyrimidines 564 

18.9 Purine Catabolism 565 

18.10 Pyrimidine Catabolism 568 

Box 18.5 Lesch-Nyhan Syndrome and Gout 569 
Summary 571 
Problems 571 
Selected Readings 572 


Part Four 

Biological Information Flow 

19 Nucleic Acids 573 

19.1 Nucleotides Are the Building Blocks of Nucleic Acids 574 

A. Ribose and Deoxyribose 574 

B. Purines and Pyrimidines 574 

C. Nucleosides 575 

D. Nucleotides 577 

19.2 DNA Is Double-Stranded 579 

A. Nucleotides Are Joined by 3'-5' Phosphodiester Linkages 580 

B. Two Antiparallel Strands Form a Double Helix 581 

C. Weak Forces Stabilize the Double Helix 583 

D. Conformations of Double-Stranded DNA 585 

19.3 DNA Can Be Supercoiled 586 

19.4 Cells Contain Several Kinds of RNA 587 
Box 19.1 Pulling DNA 588 

19.5 Nucleosomes and Chromatin 588 

A. Nucleosomes 588 

B. Higher Levels of Chromatin Structure 590 

C. Bacterial DNA Packaging 590 

19.6 Nucleases and Hydrolysis of Nucleic Acids 591 

A. Alkaline Hydrolysis of RNA 591 

B. Hydrolysis of RNA by Ribonuclease A 592 

C. Restriction Endonucleases 593 

D. EcoRI Binds Tightly to DNA 595 

19.7 Uses of Restriction Endonucleases 596 

A. Restriction Maps 596 

B. DNA Fingerprints 596 

C. Recombinant DNA 597 
Summary 598 
Problems 599 

Selected Readings 599 


20 DNA Replication, Repair, and Recombination 601 

20.1 Chromosomal DNA Replication Is Bidirectional 602 

20.2 DNA Polymerase 603 

A. Chain Elongation Is a Nucleotidyl-Group-Transfer Reaction 604 

B. DNA Polymerase III Remains Bound to the Replication Fork 606 

C. Proofreading Corrects Polymerization Errors 607 

20.3 DNA Polymerase Synthesizes Two Strands Simultaneously 607 

A. Lagging Strand Synthesis Is Discontinuous 608 

B. Each Okazaki Fragment Begins with an RNA Primer 608 

C. Okazaki Fragments Are Joined by the Action of DNA Polymerase I 
and DNA Ligase 609 

20.4 Model of the Replisome 610 

20.5 Initiation and Termination of DNA Replication 615 

20.6 DNA Replication Technologies 615 

A. The Polymerase Chain Reaction Uses DNA Polymerase to 
Amplify Selected DNA Sequences 615 

B. Sequencing DNA Using Dideoxynucleotides 616 

C. Massively Parallel DNA Sequencing by Synthesis 618 

20.7 DNA Replication in Eukaryotes 619 

20.8 Repair of Damaged DNA 622 

A. Repair after Photodimerization: An Example of Direct Repair 622 

B. Excision Repair 624 

Box 20.1 The Problem with Methylcytosine 626 

20.9 Homologous Recombination 626 

A. The Holliday Model of General Recombination 626 

B. Recombination in E. coli 627 

Box 20.2 Molecular Links Between DNA Repair and Breast Cancer 630 

C. Recombination Can Be a Form of Repair 631 
Summary 631 

Problems 632 
Selected Readings 632 

21 Transcription and RNA Processing 633 

21.1 Types of RNA 634 

21.2 RNA Polymerase 635 

A. RNA Polymerase Is an Oligomeric Protein 635 

B. The Chain Elongation Reaction 636 

21.3 Transcription Initiation 638 

A. Genes Have a 5'→3' Orientation 638 

B. The Transcription Complex Assembles at a Promoter 639 

C. The σ Subunit Recognizes the Promoter 640 

D. RNA Polymerase Changes Conformation 641 

21.4 Transcription Termination 643 

21.5 Transcription in Eukaryotes 645 

A. Eukaryotic RNA Polymerases 645 

B. Eukaryotic Transcription Factors 647 

C. The Role of Chromatin in Eukaryotic Transcription 648 

21.6 Transcription of Genes Is Regulated 648 

21.7 The lac Operon, an Example of Negative and Positive Regulation 650 

A. lac Repressor Blocks Transcription 650 

B. The Structure of lac Repressor 651 


C. cAMP Regulatory Protein Activates Transcription 652 

21.8 Post-transcriptional Modification of RNA 654 

A. Transfer RNA Processing 654 

B. Ribosomal RNA Processing 655 

21.9 Eukaryotic mRNA Processing 655 

A. Eukaryotic mRNA Molecules Have Modified Ends 657 

B. Some Eukaryotic mRNA Precursors Are Spliced 657 
Summary 663 

Problems 663 
Selected Readings 664 

22 Protein Synthesis 665 

22.1 The Genetic Code 665 

22.2 Transfer RNA 668 

A. The Three-Dimensional Structure of tRNA 668 

B. tRNA Anticodons Base-Pair with mRNA Codons 669 

22.3 Aminoacyl-tRNA Synthetases 670 

A. The Aminoacyl-tRNA Synthetase Reaction 671 

B. Specificity of Aminoacyl-tRNA Synthetases 671 

C. Proofreading Activity of Aminoacyl-tRNA Synthetases 673 

22.4 Ribosomes 673 

A. Ribosomes Are Composed of Both Ribosomal RNA and Protein 674 

B. Ribosomes Contain Two Aminoacyl-tRNA Binding Sites 675 

22.5 Initiation of Translation 675 

A. Initiator tRNA 675 

B. Initiation Complexes Assemble Only at Initiation Codons 676 

C. Initiation Factors Help Form the Initiation Complex 677 

D. Translation Initiation in Eukaryotes 679 

22.6 Chain Elongation During Protein Synthesis Is a Three-Step Microcycle 679 

A. Elongation Factors Dock an Aminoacyl-tRNA in the A Site 680 

B. Peptidyl Transferase Catalyzes Peptide Bond Formation 681 

C. Translocation Moves the Ribosome by One Codon 682 

22.7 Termination of Translation 684 

22.8 Protein Synthesis Is Energetically Expensive 684 

22.9 Regulation of Protein Synthesis 685 

A. Ribosomal Protein Synthesis Is Coupled to Ribosome 
Assembly in E. coli 685 

Box 22.1 Some Antibiotics Inhibit Protein Synthesis 686 

B. Globin Synthesis Depends on Heme Availability 687 

C. The E. coli trp Operon Is Regulated by Repression and Attenuation 687 

22.10 Post-translational Processing 689 

A. The Signal Hypothesis 691 

B. Glycosylation of Proteins 694 
Summary 694 

Problems 695 
Selected Readings 696 
Solutions 697 
Glossary 751 
Illustration Credits 767 
Index 769 

To the Student 

Welcome to biochemistry — the study of life at the molecular level. As you venture into 
this exciting and dynamic discipline, you’ll discover many new and wonderful things. 
You’ll learn how some enzymes can catalyze chemical reactions at speeds close to theoretical 
limits, reactions that would otherwise occur only at imperceptibly low rates. 
You’ll learn about the forces that maintain biomolecular structure and how even some 
of the weakest of those forces make life possible. You’ll also learn how biochemistry has 
thousands of applications in day-to-day life — in medicine, drug design, nutrition, 
forensic science, agriculture, and manufacturing. In short, you’ll begin a journey of dis- 
covery about how biochemistry makes life both possible and better. 

Before we begin, we would like to offer a few words of advice: 

Don’t just memorize facts; instead, understand principles 

In this book, we have tried to identify the most important principles of biochemistry. 
Because the knowledge base of biochemistry is continuously expanding, we must grasp 
the underlying themes of this science in order to understand it. This textbook is de- 
signed to expand on the foundation you have acquired in your chemistry and biology 
courses and to provide you with a biochemical framework that will allow you to under- 
stand new phenomena as you meet them. 

Be prepared to learn a new vocabulary 

An understanding of biochemical facts requires that you learn a biochemical vocabu- 
lary. This vocabulary includes the chemical structures of a number of key molecules. 
These molecules are grouped into families based on their structures and functions. You 
will also learn how to distinguish among members of each family and how small mole- 
cules combine to form macromolecules such as proteins and nucleic acids. 

Test your understanding 

True mastery of biochemistry lies with learning how to apply your knowledge and how 
to solve problems. Each chapter concludes with a set of carefully crafted problems that 
test your understanding of core principles. Many of these problems are mini case stud- 
ies that present the problem within the context of a real biochemical puzzle. 

For more practice, we are pleased to refer you to The Study Guide for Principles of 
Biochemistry by Scott Lefler and Allen Scism, which presents a variety of supplementary 
questions that you may find helpful. You will also find additional problems on 
TheChemistryPlace® for Principles of Biochemistry. 

Learn to visualize in 3-D 

Biochemicals are three-dimensional objects. Understanding what happens in a bio- 
chemical reaction at the molecular level requires that you be able to “see” what happens 
in three dimensions. We present the structures of simple molecules in several different 
ways in order to illustrate their three-dimensional conformation. In addition to the art 
in the book, you will find many animations and interactive molecular models on the 
website. We strongly suggest you look at these movies and do the exercises that accom- 
pany them as well as participate in the molecular visualization tutorials. 


Finally, please let us know of any errors or omissions you encounter as you use this text. 
Tell us what you would like to see in the next edition. With your help we will continue to 
evolve this work into an even more useful tool. Our e-mail addresses are at the end of 
the Preface. Good luck, and enjoy! 



Given the breadth of coverage and diversity of ways to present topics in biochemistry, 
we have tried to make the text as modular as possible to allow for greater flexibility and 
organization. Each large topic resides in its own section. Reaction mechanisms are often 
separated from the main thread of the text and can be passed over by those who prefer 
not to cover this level of detail. The text is extensively cross-referenced to make it easier 
for you to reorganize the chapters and for students to see the interrelationships among 
various topics and to drill down to deeper levels of understanding. 

We built the book explicitly for the beginning student taking a first course in bio- 
chemistry with the aim of encouraging students to think critically and to appreciate 
scientific knowledge for its own sake. Parts One and Two lay a solid foundation of 
chemical knowledge that will help students understand, rather than merely memo- 
rize, the dynamics of metabolic and genetic processes. These sections assume that stu- 
dents have taken prerequisite courses in general and organic chemistry and have ac- 
quired a rudimentary knowledge of the organic chemistry of carboxylic acids, 
amines, alcohols, and aldehydes. Even so, key functional groups and chemical proper- 
ties of each type of biomolecule are carefully explained as their structures and func- 
tions are presented. 

We also assume that students have previously taken a course in biology where they 
have learned about evolution, cell biology, genetics, and the diversity of life on this 
planet. We offer brief refreshers on these topics wherever possible. 

New to this Edition 

We are grateful for all the input we received on the first four editions of this text. You’ll 
notice the following improvements in this fifth edition: 

• Key Concept margin notes are provided throughout to highlight key concepts and 
principles that students must know. 

• Interest Boxes have been updated and expanded, with 45% new to the fifth edition. 
We use interest boxes to explain some topics in more detail, to illustrate certain principles 
with specific examples, to stimulate students’ curiosity about science, to show 
applications of biochemistry, and to explain clinical relevance. We have also added a 
few interest boxes that warn students about misunderstandings and misapplications 
of biochemistry. Examples include Blood Plasma and Sea Water; Fossil Dating by 
Amino Acid Racemization; Embryonic and Fetal Hemoglobins; Clean Clothes; The 
Perfect Enzyme; Supermouse; The Evolution of a Complex Enzyme; An Egregious 
Error; Mendel’s Seed Color Mutant; Oxygen Pollution of Earth’s Atmosphere; Extra 
Virgin Olive Oil; Missing Vitamins; Pulling DNA; and much more. 

• New Material has been added throughout, including an improved explanation of 
early evolution (the Web of Life), more emphasis on protein-protein interactions, a 
new section on intrinsically disordered proteins, and a better description of the dis- 
tinction between Gibbs free energy changes and reaction rates. We have removed 
the final chapter on Recombinant DNA Technology and integrated much of that 
material into earlier chapters. We have added descriptions of a number of new protein 
structures and integrated them into two major themes: structure-function and 
multienzyme complexes. The best example is the fatty acid synthase complex in 
Chapter 16. 

In some cases new material was necessary because recent discoveries have 
changed our view of some reactions and processes. We now know, for example, that 
older versions of uric acid catabolism were incorrect; the correct pathway is shown in 
Figure 18.23. 


We have been careful not to add extra detail unless it supports and extends the 
basic concepts and principles that we have established over the past four editions. 
Similarly, we do not introduce new subjects unless they illustrate new concepts that 
were not covered in previous editions. The goal is to keep this textbook focused on 
the fundamentals that students need to know and prevent it from bloating up into 
an encyclopedia of mostly irrelevant information that detracts from the main 
pedagogical goals. 

• Selected Readings after each chapter reflect the most current literature and these 
have been updated and extended where necessary. We have added over 120 new 
references and deleted many that are no longer appropriate. Although we have al- 
ways included references to the pedagogical literature, you will note that we have 
added quite a few more references of this type. Students now have easy access to 
these papers and they are often more informative than advanced papers in the 
purely scientific literature. 

• Art is an important component of a good textbook. Our art program has been extensively 
revised, with many new photos to illustrate concepts explained in the text, 
new and updated ribbon art, and improved versions of many figures. Many of the 
new photos are designed to attract and/or hold the students’ attention. They can be 
powerful memory aids and some of them are used to lighten up the subject in a 
way that is rarely seen in other textbooks (see page 204). We believe that the look 
and feel of the book has been much improved, making it more appealing to stu- 
dents without sacrificing any of the rigor and accuracy that has been a hallmark of 
previous editions. 

A focus on principles 

There are, in essence, two kinds of biochemistry textbooks: those for reference and 
those for teaching. It is difficult for one book to be both as it is those same thickets of 
detail sought by the professional that ensnare the struggling novice on his or her first 
trip through the forest. This text is unapologetically a text for teaching. It has been de- 
signed to foster student understanding and is not an encyclopedia of biochemistry. This 
book focuses unwaveringly on teaching basic principles and concepts, each principle 
supported by carefully chosen examples. We really do try to get students to see the forest 
and not the trees! 

Because of this focus, the material in this book can be covered in a two-semester 
course without having to tell students to skip certain chapters or certain sections. The 
book is also suitable for a one-semester course that concentrates on certain aspects of 
biochemistry where some subjects are not covered. Instructors can be confident that the 
core principles and concepts are explained thoroughly and correctly. 

A focus on chemistry 

When we first wrote this text, we decided to take the time to explain in chemical terms 
the principles that we want to emphasize. In fact, one of these principles is to show stu- 
dents that life obeys the fundamental laws of physics and chemistry. To that end, we 
offer chemical explanations of most biochemical reactions, including mechanisms that 
tell students how and why things happen. 

We are particularly proud of our explanations of oxidation-reduction reactions 
since these are extremely important in so many contexts. We describe electron move- 
ments in the early chapters, explain reduction potentials in Chapter 10 and use this un- 
derstanding to teach about chemiosmotic theory and protonmotive force in Chapter 14 
(Electron Transport and ATP Synthesis). The concept is reinforced in the chapter on 

A focus on biology 

While we emphasize chemistry, we also stress the bio in biochemistry. We point out that 
biochemical systems evolve and that the reactions that occur in some species are variations 
on a larger theme. In this edition, we increase our emphasis on the similarities of 
prokaryotic and eukaryotic systems while we continue to avoid making generalizations 
about all organisms based on reactions that occur in a few. 

The evolutionary, or comparative, approach to teaching biochemistry focuses at- 
tention on fundamental concepts. The evolutionary approach differs in many ways 
from other pedagogical methods such as an emphasis on fuel metabolism. The evolu- 
tionary approach usually begins with a description of simple fundamental principles or 
pathways or processes. These are often the pathways found in bacteria. As the lesson 
proceeds, the increasing complexity seen in some other species is explained. At the end 
of a chapter we are ready to describe the unique features of the process found in com- 
plex multicellular species, such as humans. 

Our approach entails additional changes that distinguish us from other textbooks. 
When introducing a new chapter, such as lipid metabolism, amino acid metabolism, 
and nucleotide metabolism, most other textbooks begin by treating the molecules as 
potential food for humans. We start with the biosynthesis pathways since those are the 
ones fundamental to all organisms. Then we describe the degradation pathways and end 
with an explanation of how they relate to fuel metabolism. This biosynthesis-first organization 
applies to all the major components of a cell (proteins, nucleotides, nucleic 
acids, lipids, amino acids) except carbohydrates where we continue to describe glycoly- 
sis ahead of gluconeogenesis. We do, however, emphasize that gluconeogenesis is the 
original, primitive pathway and glycolysis evolved later. 

This has always been the way DNA replication, transcription, and translation have 
been taught. In this book we extend this successful strategy to all the other topics in biochemistry. 
The chapter on photosynthesis is an excellent example of how it works in 

In some cases the emphasis on evolution can lead to a profound appreciation of 
how complex systems came to exist. Take the citric acid cycle as an example. Students 
are often told that such a process cannot be the product of evolution because all the 
parts are needed before the cycle can function. We explain in Section 13.9 how such a 
pathway can evolve in a stepwise manner. 

A focus on accuracy 

We are proud of the fact that this is the most scientifically accurate biochemistry text- 
book. We have gone to great lengths to ensure that our facts are correct and our explana- 
tions of basic concepts reflect the modern consensus among active researchers. Our suc- 
cess is due, in large part, to the dedication of our many reviewers and editors. 

The emphasis on accuracy means that we check our reactions and our nomencla- 
ture against the IUPAC/IUBMB databases. The result is balanced reactions with correct 
products and substrates and correct chemical nomenclature. For example, we are one of 
the very few textbooks that show all of the citric acid cycle reactions correctly. Previous 
editions of this textbook have always scored highly on the Biochemical Howlers website, 
and we feel confident that this edition will achieve a perfect score! 

We take the time and effort to accurately describe some difficult concepts such as 
Gibbs free energy change in a steady-state situation where most reactions are near- 
equilibrium reactions (ΔG = 0). We present correct definitions of the Central Dogma of 
Molecular Biology. We don’t avoid genuine areas of scientific controversy such as the 
validity of the Three Domain Hypothesis or the mechanism of lysozyme. 

A focus on structure-function 

Biochemistry is a three-dimensional science. Our inclusion of the latest computer gen- 
erated images is intended to clarify the shape and function of molecules and to leave 
students with an appreciation for the relationship between the structure and function. 
Many of the protein images in this edition are new; they have been skillfully prepared by 
Jonathan Parrish of the University of Alberta. 

We offer a number of other opportunities. For those students with access to a computer, 
we have included Protein Data Bank (PDB) reference numbers for the coordinates 
from which all protein images were derived. This allows students to further explore the 
structures on their own. In addition, we have a gallery of prepared PDB files that students 
can view using Chime or any other molecular viewer; these are posted on the 
text’s TheChemistryPlace® website, as are animations of key dynamic 
processes as well as visualization tutorials using Chime. 

The emphasis on protein/enzyme structure is a key part of the theme of structure- 
function that is one of the most important concepts in biochemistry. At various places 
in this new edition we have added material to emphasize this relationship and to develop 
it to a greater extent than we have in the past. Some of the most important reactions in 
the cell, such as the Q-cycle, cannot be properly understood without understanding the 
structure of the enzyme that catalyzes them. Similarly, understanding the properties of 
double-stranded DNA is essential to understanding how it serves as the storehouse of 
biological information. 

Walkthrough of features with some visuals 


Biochemistry is at the root of a number of related sciences, including medicine, forensic 
science, biotechnology, and bioengineering; there are many interesting stories to tell. 
Throughout the text, you will find boxes that relate biochemistry to other topics. Some 
of them are intended to be humorous and help students relate to the material. 


One of the characteristics of sugars is that they taste sweet. 
You certainly know the taste of sucrose and you probably 
know that fructose and lactose also taste sweet. So do many 
of the other sugars and their derivatives, although we don’t 
recommend that you go into a biochemistry lab and start 
tasting all the carbohydrates in those white plastic bottles on 
the shelves. 

Sweetness is not a physical property of molecules. It’s a 
subjective interaction between a chemical and taste receptors 
in your mouth. There are five different kinds of taste recep- 
tors: sweet, sour, salty, bitter, and umami (umami is like the 
taste of glutamate in monosodium glutamate). In order to 
trigger the sweet taste, a molecule like sucrose has to bind to 
the receptor and initiate a response that eventually makes it 
to your brain. Sucrose elicits a moderately strong response 
that serves as the standard for sweetness. The response to 
fructose is almost twice as strong and the response to lactose 
is only about one-fifth as strong as that of sucrose. Artificial 
sweeteners such as saccharin (Sweet’N Low®), sucralose 
(Splenda®), and aspartame (NutraSweet®) bind to the sweetness 
receptor and cause the sensation of sweetness. They are 
hundreds of times sweeter than sucrose. 

The sweetness receptor is encoded by two genes called 
Tas1r2 and Tas1r3. We don’t know how sucrose and the other 
ligands bind to this receptor even though this is a very active 
area of research. In the case of sucrose and the artificial sweeteners, 
how can such different molecules elicit the taste of 

Cats, including lions, tigers and cheetahs, do not have a 
functional Tas1r2 gene. It has been converted to a pseudogene 
because of a 247 bp deletion in exon 3. It’s very likely 
that your pet cat has never experienced the taste of sweetness. 
That explains a lot about cats. 


▲ Cats are carnivores. They probably can’t 
taste sweetness. 


Key Concepts 

Key Concept notes in the margin guide students to the most important information 
in each section. 

Complete Explanations of the Chemistry 

There are thousands of metabolic reactions in a typical organism. You might try to 
memorize them all but eventually you will run out of memory. What’s more, memo- 
rization will not help you if you encounter something you haven’t seen before. In this 
book, we show you some of the basic mechanisms of enzyme-catalyzed reactions, an 
extension of what you learned in organic chemistry. If you understand the mechanism, 
you’ll understand the chemistry. You’ll have less to memorize, and you’ll retain the in- 
formation more effectively. 

Margin Notes 

There is a great deal of detail in biochemistry but we want you to see both the forest and 
the trees. When we need to cross-reference something discussed earlier in the book, or 
something that we will come back to later, we put it in the margin. Backward references 
offer a review of concepts you may have forgotten. Forward references will help you see 
the big picture. 


Biochemistry is a three-dimensional science and we have placed a great emphasis on help- 
ing you visualize abstract concepts and molecules too small to see. We have tried to make 
illustrative figures both informative and beautiful. 


The standard Gibbs free energy change 
(ΔG°′) tells us the direction of a reaction 
when all products and reactants are 
present at a concentration of 1 M. 
These conditions will never occur in living 
cells. Biochemists are only interested in 
actual Gibbs free energy changes (ΔG), 
which are usually close to zero. The 
standard Gibbs free energy change (ΔG°′) 
tells us the relative concentrations of 
reactants and products when the reaction 
reaches equilibrium. 
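
The link between the standard free energy change and the equilibrium position can be sketched numerically. The block below assumes the standard thermodynamic relation ΔG°′ = −RT ln K_eq, which this note does not state, and a temperature of 25°C. The function names `equilibrium_constant` and `standard_delta_g` are hypothetical, chosen only for illustration.

```python
import math

R = 8.314  # gas constant, J K^-1 mol^-1

def equilibrium_constant(delta_g_standard, temperature=298.15):
    """Return K_eq for a standard free energy change given in J mol^-1.

    Assumes the relation delta_g_standard = -R * T * ln(K_eq).
    """
    return math.exp(-delta_g_standard / (R * temperature))

def standard_delta_g(k_eq, temperature=298.15):
    """Inverse of equilibrium_constant: return the standard change in J mol^-1."""
    return -R * temperature * math.log(k_eq)

if __name__ == "__main__":
    # A standard free energy change of zero means products and reactants
    # are equally favored at equilibrium (K_eq = 1).
    print(equilibrium_constant(0.0))
    # A strongly negative value means products dominate at equilibrium.
    print(equilibrium_constant(-32000.0))
```

A negative ΔG°′ gives K_eq greater than 1, so products predominate at equilibrium; it says nothing about how fast equilibrium is reached.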

The distinction between the normal 
flow of information and the Central 
Dogma of Molecular Biology is 
explained in Section 1.1 and the intro- 
duction to Chapter 21. 



Sample Calculations 

Sample Calculations are included throughout the text to provide a problem solving 
model and illustrate required calculations. 

SAMPLE CALCULATION 10.2 Gibbs Free Energy Change 

Q: In a rat hepatocyte, the concentrations of ATP, ADP, and Pi are 3.4 mM, 1.3 mM, and 4.8 mM, respectively. Calculate the Gibbs free energy change for hydrolysis of ATP in this cell. How does this compare to the standard free energy change? 

A: The actual Gibbs free energy change is calculated according to Equation 10.10. 

$$\Delta G_{\text{reaction}} = \Delta G^{\circ\prime}_{\text{reaction}} + RT \ln \frac{[\text{ADP}][\text{P}_i]}{[\text{ATP}]} = \Delta G^{\circ\prime}_{\text{reaction}} + 2.303\,RT \log \frac{[\text{ADP}][\text{P}_i]}{[\text{ATP}]}$$

When known values and constants are substituted (with concentrations expressed as molar values), assuming pH 7.0 and 25°C: 

$$\Delta G = -32000\ \text{J mol}^{-1} + (8.31\ \text{J K}^{-1}\,\text{mol}^{-1})(298\ \text{K})\left[2.303 \log \frac{(1.3 \times 10^{-3})(4.8 \times 10^{-3})}{(3.4 \times 10^{-3})}\right]$$

$$\Delta G = -32000\ \text{J mol}^{-1} + (2480\ \text{J mol}^{-1})\left[2.303 \log (1.8 \times 10^{-3})\right]$$

$$\Delta G = -32000\ \text{J mol}^{-1} - 16000\ \text{J mol}^{-1}$$

$$\Delta G = -48000\ \text{J mol}^{-1} = -48\ \text{kJ mol}^{-1}$$

The actual free energy change is about 1½ times the standard free energy change. 
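
The arithmetic in this sample calculation can be checked with a short script. This is a minimal sketch that uses the values given above (ΔG°′ = −32000 J mol⁻¹, R = 8.31 J K⁻¹ mol⁻¹, T = 298 K); the function name `actual_delta_g` and its parameter names are hypothetical.

```python
import math

R = 8.31  # gas constant as used in the sample calculation, J K^-1 mol^-1

def actual_delta_g(delta_g_standard, atp, adp, p_i, temperature=298.0):
    """Actual free energy change (J mol^-1) for ATP hydrolysis.

    Concentrations are in mol/L. The mass action ratio is [ADP][Pi]/[ATP].
    """
    mass_action_ratio = (adp * p_i) / atp
    return delta_g_standard + R * temperature * math.log(mass_action_ratio)

if __name__ == "__main__":
    # Rat hepatocyte concentrations from the sample calculation, in mol/L.
    dg = actual_delta_g(-32000.0, atp=3.4e-3, adp=1.3e-3, p_i=4.8e-3)
    # Agrees with the rounded textbook result of about -48 kJ mol^-1.
    print(f"{dg / 1000:.1f} kJ/mol")
```

Using the natural logarithm directly is equivalent to the 2.303 log form in the worked answer. The small difference from −48 kJ mol⁻¹ comes from rounding the intermediate terms in the text.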

The Organization 

We adopt the metabolism-first strategy of organizing the topics in this book. This means 
we begin with proteins and enzymes then describe carbohydrates and lipids. This is fol- 
lowed by a description of intermediary metabolism and bioenergetics. The structure of 
nucleic acids follows the chapter on nucleotide metabolism and the information flow 
chapters are at the back of the book. 

While we believe there are significant advantages to teaching the subjects in this 
order, we recognize that some instructors prefer to teach information flow earlier in the 
course. We have tried to make the last four chapters on nucleic acids, DNA replication, 
transcription, and translation less dependent on the earlier chapters but they do discuss 
aspects of enzymes that rely on Chapters 4, 5 and 6. Instructors may choose to intro- 
duce these last four chapters after a description of enzymes if they wish. 

This book has a chapter on coenzymes unlike most other biochemistry textbooks. 
We believe that it is important to put more emphasis on the role of coenzymes (and 
vitamins) and that’s why we have placed this chapter right after the two chapters on en- 
zymes. We know that most instructors prefer to teach the individual coenzymes when 
specific examples come up in other contexts. We do that as well. This organization allows 
instructors to refer back to Chapter 7 at whatever point they wish. 

Student Supplements 

The Study Guide for Principles of Biochemistry 

by Scott Lefler 

(Arizona State University) and 
Allen J. Scism 

(Central Missouri State University) 

No student should be without this helpful resource. Contents include the following: 

• carefully constructed drill problems for each chapter, including short-answer, multiple- 
choice, and challenge problems 

• comprehensive, step-by-step solutions and explanations for all problems 

• a remedial chapter that reviews the general and organic chemistry that students re- 
quire for biochemistry — topics are ingeniously presented in the context of a metabolic 

• tables of essential data 



Chemistry Place for Principles of Biochemistry 

An online student tool that includes 3-D modules to help visualize biochemistry and 
MediaLabs to investigate important issues related to its particular chapter. Please visit 
the site at 


We are grateful to our many talented and thoughtful reviewers who have helped shape this book. 

Reviewers who helped in the Fifth Edition: 

Accuracy Reviewers 

Barry Ganong, Mansfield University 

Scott Lefler, Arizona State 

Kathleen Nolta, University of Michigan 

Content Reviewers 

Michelle Chang, University of California, Berkeley 

Kathleen Cornely, Providence College 

Ricky Cox, Murray State University 

Michel Goldschmidt-Clermont, University of Geneva 

Phil Klebba, University of Oklahoma, Norman 

Kristi McQuade, Bradley University 

Liz Roberts-Kirchoff, University of Detroit, Mercy 

Ashley Spies, University of Illinois 

Dylan Taatjes, University of Colorado, Boulder 

David Tu, Pennsylvania State University 

Jeff Wilkinson, Mississippi State University 

Lauren Zapanta, University of Pittsburgh 

Reviewers who helped in the Fourth Edition: 

Accuracy Reviewers 

Neil Haave, University of Alberta 
David Watt, University of Kentucky 

Content Reviewers 

Consuelo Alvarez, Longwood University 
Marilee Benore Parsons, University of Michigan 
Gary J. Blomquist, University of Nevada, Reno 
Albert M. Bobst, University of Cincinnati 
Kelly Drew, University of Alaska, Fairbanks 
Andrew Feig, Indiana University 
Giovanni Gadda, Georgia State University 
Donna L. Gosnell, Valdosta State University 
Charles Hardin, North Carolina State University 
Jane E. Hobson, Kwantlen University College 
Ramji L. Khandelwal, University of Saskatchewan 
Scott Lefler, Arizona State 
Kathleen Nolta, University of Michigan 

Jeffrey Schineller, Humboldt State University 

Richard Shingles, Johns Hopkins University 

Michael A. Sypes, Pennsylvania State University 

Martin T. Tuck, Ohio University 

Julio F. Turrens, University of South Alabama 

David Watt, University of Kentucky 

James Zimmerman, Clemson University 

Thank you to J. David Rawn whose work laid the foundation 
for this text. We would also like to thank our colleagues who 
have previously contributed material for particular chapters 
and whose careful work still inhabits this book: 

Roy Baker, University of Toronto 
Roger W. Brownsey, University of British Columbia 
Willy Kalt, Agriculture Canada 
Robert K. Murray, University of Toronto 
Ray Ochs, St. John’s University 
Morgan Ryan, American Scientist 
Frances Sharom, University of Guelph 
Malcolm Watford, Rutgers, The State University of 
New Jersey 

Putting this book together was a collaborative effort, and 
we would like to thank various members of the team who have 
helped give this project life: Jonathan Parrish, Jay McElroy, Lisa 
Shoemaker, and the artists of Prentice Hall; Lisa Tarabokjia, 
Editorial Assistant, Jessica Neumann, Associate Editor, Lisa 
Pierce, Assistant Editor in charge of supplements, Lauren 
Layn, Media Editor, Erin Gardner, Marketing Manager; and 
Wendy Perez, Production Editor. We would also like to thank 
Jeanne Zalesky, our Executive Editor at Prentice Hall. 

Finally, we close with an invitation for feedback. Despite 
our best efforts (and a terrific track record in the previous edi- 
tions), there are bound to be mistakes in a work of this size. We 
are committed to making this the best biochemistry text avail- 
able; please know that all comments are welcome. 

Laurence A. Moran 
Marc D. Perry 


About the Authors 

Laurence A. Moran 

After earning his Ph.D. from Princeton University in 1974, 
Professor Moran spent four years at the Université de Genève 
in Switzerland. He has been a member of the Department of 
Biochemistry at the University of Toronto since 1978, special- 
izing in molecular biology and molecular evolution. His re- 
search findings on heat-shock genes have been published in 
many scholarly journals. 

H. Robert Horton 

Dr. Horton, who received his Ph.D. from the University of Mis- 
souri in 1962, is William Neal Reynolds Professor Emeritus and 
Alumni Distinguished Professor Emeritus in the Department 
of Biochemistry at North Carolina State University, where he 
served on the faculty for over 30 years. Most of Professor Horton’s 
research was in protein and enzyme mechanisms. 

K. Gray Scrimgeour 

Professor Scrimgeour received his doctorate from the Univer- 
sity of Washington in 1961 and was a faculty member at the 
University of Toronto for over 30 years. He is the author of 
The Chemistry and Control of Enzymatic Reactions (1977, Aca- 
demic Press), and his work on enzymatic systems has been 
published in more than 50 professional journal articles during 
the past 40 years. From 1984 to 1992, he was editor of the 
journal Biochemistry and Cell Biology. 

Marc D. Perry 

After earning his Ph.D. from the University of Toronto in 1988, 
Dr. Perry trained at the University of Colorado, where he stud- 
ied sex determination in the nematode C. elegans. In 1994 he 
returned to the University of Toronto as a faculty member in 
the Department of Molecular and Medical Genetics. His re- 
search has focused on developmental genetics, meiosis, and 
bioinformatics. In 2008 he joined the Ontario Institute for 
Cancer Research. 

New problems and solutions for the fifth edition were created by Laurence A. Moran, University of Toronto. The 
remaining problems were created by Drs. Robert N. Lindquist, San Francisco State University, Marc Perry, 
and Diane M. De Abreu of the University of Toronto. 



Introduction to Biochemistry 

Biochemistry is the discipline that uses the principles and language of chemistry 
to explain biology. Over the past 100 years biochemists have discovered that the 
same chemical compounds and the same central metabolic processes are found 
in organisms as distantly related as bacteria, plants, and humans. It is now known that 
the basic principles of biochemistry are common to all living organisms. Although sci- 
entists usually concentrate their research efforts on particular organisms, their results 
can be applied to many other species. 

This book is called Principles of Biochemistry because we will focus on the most im- 
portant and fundamental concepts of biochemistry — those that are common to most 
species. Where appropriate, we will point out features that distinguish particular groups 
of organisms. 

Many students and researchers are primarily interested in the biochemistry of 
humans. The causes of disease and the importance of proper nutrition, for example, 
are fascinating topics in biochemistry. We share these interests and that’s why we in- 
clude many references to human biochemistry in this textbook. However, we will also 
try to interest you in the biochemistry of other species. As it turns out, it is often easier 
to understand basic principles of biochemistry by studying many different species 
in order to recognize common themes and patterns. But a knowledge and appreciation 
of other species will do more than help you learn biochemistry. It will also help you 
recognize the fundamental nature of life at the molecular level and the ways in which 
species are related through evolution from a common ancestor. Perhaps future editions 
of this book will include chapters on the biochemistry of life on other planets. 
Until then, we will have to be satisfied with learning about the diverse life on our own 
planet. 
We begin this introductory chapter with a few highlights of the history of biochem- 
istry, followed by short descriptions of the chemical groups and molecules you will en- 
counter throughout this book. The second half of the chapter is an overview of cell 
structure in preparation for your study of biochemistry. 

Anything found to be true of E. coli 
must also be true of elephants. 

— Jacques Monod 

Top: Adenovirus. Viruses consist of a nucleic acid molecule surrounded by a protein coat. 



▲ Friedrich Wöhler (1800-1882). Wöhler was 
one of the founders of biochemistry. By synthesizing 
urea, Wöhler showed that compounds 
found in living organisms could be made in 
the laboratory from inorganic substances. 

▲ Some of the apparatus used by Louis 
Pasteur in his Paris laboratory. 

▲ Eduard Buchner (1860-1917). Buchner 
was awarded the Nobel Prize in Chemistry in 
1907 “for his biochemical researches and 
his discovery of cell-free fermentation.” 

1.1 Biochemistry Is a Modern Science 

Biochemistry has emerged as an independent science only within the past 100 years but 
the groundwork for the emergence of biochemistry as a modern science was prepared 
in earlier centuries. The period before 1900 saw rapid advances in the understanding of 
basic chemical principles such as reaction kinetics and the atomic composition of mol- 
ecules. Many chemicals produced in living organisms had been identified by the end of 
the 19th century. Since then, biochemistry has become an organized discipline and bio- 
chemists have elucidated many of the chemical processes of life. The growth of bio- 
chemistry and its influence on other disciplines will continue in the 21st century. 

In 1828, Friedrich Wöhler synthesized the organic compound urea by heating the
inorganic compound ammonium cyanate. 


$$\mathrm{NH_4(OCN)} \xrightarrow{\text{heat}} \mathrm{H_2N{-}CO{-}NH_2}$$

This experiment showed for the first time that compounds found exclusively in living or- 
ganisms could be synthesized from common inorganic substances. Today we understand 
that the synthesis and degradation of biological substances obey the same chemical and 
physical laws as those that predominate outside of biology. No special or “vitalistic” 
processes are required to explain life at the molecular level. Many scientists date the
beginnings of biochemistry to Wöhler’s synthesis of urea, although it would be another 75 years
before the first biochemistry departments were established at universities. 

Louis Pasteur (1822-1895) is best known as the founder of microbiology and an 
active promoter of germ theory. But Pasteur also made many contributions to biochem- 
istry including the discovery of stereoisomers. 

Two major breakthroughs in the history of biochemistry are especially notable — the 
discovery of the roles of enzymes as catalysts and the role of nucleic acids as informa- 
tion-carrying molecules. The very large size of proteins and nucleic acids made their ini- 
tial characterization difficult using the techniques available in the early part of the 20th 
century. With the development of modern technology we now know a great deal about 
how the structures of proteins and nucleic acids are related to their biological functions. 

The first breakthrough — identification of enzymes as the catalysts of biological re- 
actions — resulted in part from the research of Eduard Buchner. In 1897 Buchner 
showed that extracts of yeast cells could catalyze the fermentation of the sugar glucose 
to alcohol and carbon dioxide. Previously, scientists believed that only living cells could 
catalyze such complex biological reactions. 

The nature of biological catalysts was explored by Buchner’s contemporary, Emil 
Fischer. Fischer studied the catalytic effect of yeast enzymes on the hydrolysis (break- 
down by water) of sucrose (table sugar). He proposed that during catalysis an enzyme 
and its reactant, or substrate, combine to form an intermediate compound. He also pro- 
posed that only a molecule with a suitable structure can serve as a substrate for a given 
enzyme. Fischer described enzymes as rigid templates, or locks, and substrates as 
matching keys. Researchers soon realized that almost all the reactions of life are cat- 
alyzed by enzymes and a modified lock-and-key theory of enzyme action remains a 
central tenet of modern biochemistry. 

Another key property of enzyme catalysis is that biological reactions occur much 
faster than they would without a catalyst. In addition to speeding up the rates of reac- 
tions, enzyme catalysts produce very high yields with few, if any, by-products. In con- 
trast, many catalyzed reactions in organic chemistry are considered acceptable with 
yields of 50% to 60%. Biochemical reactions must be more efficient because by- 
products can be toxic to cells and their formation would waste precious energy. The 
mechanisms of catalysis are described in Chapter 5. 
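
The importance of high yields becomes clearer when several reactions follow one another. As a rough sketch, assume a hypothetical pathway of ten sequential steps in which every step has the same fractional yield. The ten-step length and the 99% comparison figure are illustrative assumptions, not measurements from this chapter. The overall yield is the product of the step yields.

```python
def overall_yield(step_yield: float, n_steps: int) -> float:
    """Fraction of starting material that survives n_steps sequential
    reactions when each step has the same fractional yield."""
    if not 0.0 <= step_yield <= 1.0:
        raise ValueError("step_yield must be between 0 and 1")
    if n_steps < 0:
        raise ValueError("n_steps must be non-negative")
    return step_yield ** n_steps


# Compare a 60% per-step yield (the organic chemistry figure quoted above)
# with an assumed 99% per-step yield over a hypothetical 10-step pathway.
for per_step in (0.60, 0.99):
    total = overall_yield(per_step, 10)
    print(f"{per_step:.0%} per step -> {total:.1%} overall")
# 60% per step leaves about 0.6% of the starting material;
# 99% per step leaves about 90%.
```

Under these assumptions, a yield that is acceptable for a single laboratory reaction would waste nearly all of the starting material over a long pathway.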

The last half of the 20th century saw tremendous advances in the area of structural 
biology, especially the structure of proteins. The first protein structures were solved in 
the 1950s and 1960s by scientists at Cambridge University (United Kingdom) led by
John C. Kendrew and Max Perutz. Since then, the three-dimensional structures of several
thousand different proteins have been determined and our understanding of the com- 
plex biochemistry of proteins has increased enormously. These rapid advances were 
made possible by the availability of larger and faster computers and new software that 
could carry out the many calculations that used to be done by hand using simple calcu- 
lators. Much of modern biochemistry relies on computers. 

The second major breakthrough in the history of biochemistry — identification of 
nucleic acids as information molecules — came a half-century after Buchner’s and Fis- 
cher’s experiments. In 1944 Oswald Avery, Colin MacLeod, and Maclyn McCarty ex- 
tracted deoxyribonucleic acid (DNA) from a pathogenic strain of the bacterium 
Streptococcus pneumoniae and mixed the DNA with a nonpathogenic strain of the same 
organism. The nonpathogenic strain was permanently transformed into a pathogenic 
strain. This experiment provided the first conclusive evidence that DNA is the genetic 
material. In 1953 James D. Watson and Francis H. C. Crick deduced the three-dimen- 
sional structure of DNA. The structure of DNA immediately suggested to Watson and 
Crick a method whereby DNA could reproduce itself, or replicate, and thus transmit bi- 
ological information to succeeding generations. Subsequent research showed that infor- 
mation encoded in DNA can be transcribed to ribonucleic acid (RNA) and then trans- 
lated into protein. 
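
The flow of information from DNA to RNA to protein can be sketched in a few lines of code. This is a deliberately simplified illustration. It assumes that the input is the coding strand of the DNA, so that the mRNA has the same sequence with U in place of T, and it uses a small codon table containing only the codons needed for the example sequence. The function names and the example sequence are made up for this sketch. The details of transcription and translation are covered in later chapters.

```python
# Partial codon table (standard genetic code); only the codons
# needed for the example sequence are included.
CODON_TABLE = {
    "AUG": "Met",
    "UUU": "Phe",
    "GGC": "Gly",
    "GCU": "Ala",
    "UAA": "Stop",
}


def transcribe(coding_strand: str) -> str:
    """Return the mRNA sequence matching a DNA coding strand (T -> U)."""
    return coding_strand.upper().replace("T", "U")


def translate(mrna: str) -> list[str]:
    """Read the mRNA three bases at a time until a stop codon is reached."""
    residues = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "Stop":
            break
        residues.append(amino_acid)
    return residues


dna = "ATGTTTGGCGCTTAA"       # hypothetical coding-strand sequence
mrna = transcribe(dna)        # "AUGUUUGGCGCUUAA"
print(mrna)
print("-".join(translate(mrna)))   # Met-Phe-Gly-Ala
```

Note that the sketch contains no function that converts a protein sequence back into a nucleic acid sequence.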

The study of genetics at the level of nucleic acid molecules is part of the discipline 
of molecular biology and molecular biology is part of the discipline of biochemistry. In 
order to understand how nucleic acids store and transmit genetic information, you 
must understand the structure of nucleic acids and their role in information flow. You 
will find that much of your study of biochemistry is devoted to considering how en- 
zymes and nucleic acids are central to the chemistry of life. 

As Crick predicted in 1958, the normal flow of information from nucleic acid to 
protein is not reversible. He referred to this unidirectional information flow from nu- 
cleic acid to protein as the Central Dogma of Molecular Biology. The term “Central 
Dogma” is often misunderstood. Strictly speaking, it does not refer to the overall flow of 
information shown in the figure. Instead, it refers to the fact that once information in 
nucleic acids is transferred to protein it cannot flow backwards from protein to nucleic acids.





▲ Information flow in molecular biology. The
flow of information is normally from DNA to
RNA. Some RNAs (messenger RNAs) are
translated. Some RNA can be reverse
transcribed back to DNA but according to Crick’s
Central Dogma of Molecular Biology the
transfer of information from nucleic acid
(e.g., mRNA) to protein is irreversible.

1.2 The Chemical Elements of Life 

Six nonmetallic elements — carbon, hydrogen, nitrogen, oxygen, phosphorus, and sul- 
fur — account for more than 97% of the weight of most organisms. All these elements 
can form stable covalent bonds. The relative amounts of these six elements vary among 
organisms. Water is a major component of cells and accounts for the high percentage 
(by weight) of oxygen. Carbon is much more abundant in living organisms than in the 
rest of the universe. On the other hand, some elements, such as silicon, aluminum, and 
iron, are very common in the Earth’s crust but are present only in trace amounts in 
cells. In addition to the standard six elements (CHNOPS), there are 23 other elements
commonly found in living organisms (Figure 1.1). These include five ions that are
essential in all species: calcium (Ca²⁺), potassium (K⁺), sodium (Na⁺), magnesium (Mg²⁺),
and chloride (Cl⁻). Note that the additional 23 elements account for only 3% of the
weight of living organisms.

Most of the solid material of cells consists of carbon-containing compounds. The 
study of such compounds falls into the domain of organic chemistry. A course in or- 
ganic chemistry is helpful in understanding biochemistry because there is considerable 
overlap between the two disciplines. Organic chemists are more interested in reactions 
that take place in the laboratory, whereas biochemists would like to understand how re- 
actions occur in living cells. 

Figure 1.2a shows the basic types of organic compounds commonly encountered in 
biochemistry. Make sure you are familiar with these terms because we will be using 
them repeatedly in the rest of this book. 

▲ Emil Fischer (1852-1919). Fischer made 
many contributions to our understanding of 
the structures and functions of biological 
molecules. He received the Nobel Prize in 
Chemistry in 1902 “in recognition of the 
extraordinary services he has rendered by 
his work on sugar and purine synthesis.” 

▲ DNA encodes most of the information 
required in living cells. 

▲ Figure 1.1 

Periodic Table of the Elements. The important elements found in living cells are shown in color. The 
red elements (CHNOPS) are the six abundant elements. The five essential ions are purple. The 
trace elements are shown in dark blue (more common) and light blue (less common). 

The synthesis of RNA (transcription) 
and protein (translation) are described 
in Chapters 21 and 22, respectively. 


More than 97% of the weight of most
organisms is made up of only six
elements: carbon, hydrogen, nitrogen,
oxygen, phosphorus, and sulfur.


Living things obey the standard laws of 
physics and chemistry. No “vitalistic” 
force is required to explain life at the 
molecular level. 

Biochemical reactions involve specific chemical bonds or parts of molecules called 
functional groups (Figure 1.2b). We will encounter several common linkages in bio- 
chemistry (Figure 1.2c). Note that all these linkages consist of several different atoms 
and individual bonds between atoms. We will learn more about these compounds, 
functional groups, and linkages throughout this book. Ester and ether linkages are com- 
mon in fatty acids and lipids. Amide linkages are found in proteins. Phosphate ester and 
phosphoanhydride linkages occur in nucleotides. 

An important theme of biochemistry is that the chemical reactions occurring in- 
side cells are the same kinds of reactions that take place in a chemistry laboratory. The 
most important difference is that almost all reactions in living cells are catalyzed by en- 
zymes and thus proceed at very high rates. One of the main goals of this textbook is to 
explain how enzymes speed up reactions without violating the fundamental reaction 
mechanisms of organic chemistry. 

The catalytic efficiency of enzymes can be observed even when the enzymes and re- 
actants are isolated in a test tube. Researchers often find it useful to distinguish between 
biochemical reactions that take place in an organism (in vivo) and those that occur 
under laboratory conditions (in vitro). 

1.3 Many Important Macromolecules Are Polymers 

In addition to numerous small molecules, much of biochemistry deals with very large 
molecules that we refer to as macromolecules. Biological macromolecules are usually a 
form of polymer created by joining many smaller organic molecules, or monomers, via 
condensation (removal of the elements of water). In some cases, such as certain carbo- 
hydrates, a single monomer is repeated many times; in other cases, such as proteins and 
nucleic acids, a variety of different monomers is connected in a particular order. Each 
monomer of a given polymer is added by repeating the same enzyme-catalyzed reaction.

◄ Figure 1.2

General formulas of (a) organic compounds, (b) functional groups, and (c) linkages
common in biochemistry. R represents an alkyl group (CH₃(CH₂)ₙ).

(a) Organic compounds: alcohol (R-OH); aldehyde (R-CHO); carboxylic acid¹ (R-COOH);
thiol (R-SH); amines² (primary, R-NH₂; secondary, R-NH-R₁; tertiary, R-N(R₁)-R₂)

(b) Functional groups: hydroxyl (-OH); acyl (-CO-R); carbonyl (-CO-); sulfhydryl (-SH);
amino (-NH₂ or -NH₃⁺); phosphate (-O-PO₃²⁻)

(c) Linkages in biochemical compounds: ester (C-O-CO); ether (C-O-C);
phosphate ester (C-O-PO₃²⁻)

¹ Under most biological conditions, carboxylic acids exist as carboxylate
anions: R-COO⁻.

² Under most biological conditions, amines exist as ammonium ions:
R-NH₃⁺, R-NH₂⁺-R₁, and R-NH⁺(R₁)-R₂.

Thus, all of the monomers, or residues, in a macromolecule are aligned in the same di- 
rection and the ends of the macromolecule are chemically distinct. 
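
Because one molecule of water is removed for every linkage formed, the composition of a polymer can be worked out from the composition of its monomer. The sketch below assumes a linear, unbranched chain of n identical glucose residues (C₆H₁₂O₆), so that n - 1 water molecules are lost. The atomic masses are rounded standard values and the function names are invented for this illustration.

```python
from collections import Counter

# Rounded standard atomic masses (relative to 1/12 the mass of carbon-12).
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "O": 15.999}

GLUCOSE = Counter({"C": 6, "H": 12, "O": 6})
WATER = Counter({"H": 2, "O": 1})


def condensation_polymer(monomer: Counter, n: int) -> Counter:
    """Elemental formula of a linear chain of n identical monomers
    joined by condensation (n - 1 water molecules removed)."""
    if n < 1:
        raise ValueError("n must be at least 1")
    formula = Counter()
    for element, count in monomer.items():
        formula[element] = n * count
    for element, count in WATER.items():
        formula[element] -= (n - 1) * count
    return formula


def relative_mass(formula: Counter) -> float:
    """Sum of atomic masses for every atom in the formula."""
    return sum(ATOMIC_MASS[element] * count for element, count in formula.items())


disaccharide = condensation_polymer(GLUCOSE, 2)
print(dict(disaccharide))                   # C12 H22 O11
print(round(relative_mass(disaccharide), 3))
```

Joining two glucose molecules gives C₁₂H₂₂O₁₁ rather than C₁₂H₂₄O₁₂, which is why each unit within a polymer is called a residue rather than a monomer.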

Macromolecules have properties that are very different from those of their con- 
stituent monomers. For example, starch is a polymer of the sugar glucose but it is not 
soluble in water and does not taste sweet. Observations such as this have led to the gen- 
eral principle of the hierarchical organization of life. Each new level of organization re- 
sults in properties that cannot be predicted solely from those of the previous level. The 
levels of complexity, in increasing order, are: atoms, molecules, macromolecules, or- 
ganelles, cells, tissues, organs, and whole organisms. (Note that many species lack one or 
more of these levels of complexity. Single-celled organisms, for example, do not have 
tissues and organs.) The following sections briefly describe the principal types of 
macromolecules and how their sequences of residues or three-dimensional shapes grant 
them unique properties. 

The relative molecular mass (Mᵣ) of a
molecule is a dimensionless quantity
referring to the mass of a molecule
relative to one-twelfth (1/12) the mass of
an atom of the carbon isotope ¹²C.
Molecular weight (M.W.) is another
term for relative molecular mass.

In discussing molecules and macromolecules we will often refer to the molecular
weight of a compound. A more precise term for molecular weight is relative molecular mass
(abbreviated Mᵣ). It is the mass of a molecule relative to one-twelfth (1/12) the mass of an
atom of the carbon isotope ¹²C. (The atomic weight of this isotope has been defined as
exactly 12 atomic mass units. Note that the atomic weight of carbon shown in the Periodic
Table represents the average of several different isotopes, including ¹³C and ¹⁴C.) Because
Mᵣ is a relative quantity, it is dimensionless and has no units associated with its value. The
relative molecular mass of a typical protein, for example, is 38,000 (Mᵣ = 38,000). The
absolute molecular mass of a compound has the same magnitude as the molecular
weight except that it is expressed in units called daltons (1 dalton = 1 atomic mass unit).
The molecular mass is also called the molar mass because it represents the mass
(measured in grams) of 1 mole, or 6.022 × 10²³ molecules. The molecular mass of a typical
protein is 38,000 daltons, which means that 1 mole weighs 38 kilograms. The main source
of confusion is that the term “molecular weight” has become common jargon in
biochemistry although it refers to relative molecular mass and not to weight. It is a common error
to give a molecular weight in daltons when it should be dimensionless. In most cases, this
isn’t a very important mistake but you should know the correct terminology.
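
The relationships among these quantities can be checked with a short calculation. The sketch below uses the typical protein from the text (Mᵣ = 38,000) and the value of 6.022 × 10²³ molecules per mole. The function names and the one-microgram sample are assumptions made for this illustration.

```python
AVOGADRO = 6.022e23  # molecules per mole (value quoted in the text)


def molar_mass_kg(m_r: float) -> float:
    """Mass of 1 mole in kilograms (M_r grams per mole, converted to kg)."""
    return m_r / 1000.0


def grams_per_molecule(m_r: float) -> float:
    """Mass of a single molecule in grams."""
    return m_r / AVOGADRO


def molecules_in_sample(sample_grams: float, m_r: float) -> float:
    """Number of molecules in a sample of the given mass."""
    return sample_grams / m_r * AVOGADRO


M_R_PROTEIN = 38_000  # the typical protein used as the example in the text

print(molar_mass_kg(M_R_PROTEIN))                     # 38.0 kg per mole
print(f"{grams_per_molecule(M_R_PROTEIN):.3e} g")     # mass of one molecule
print(f"{molecules_in_sample(1e-6, M_R_PROTEIN):.3e} molecules in 1 microgram")
```

The same number, 38,000, appears as a dimensionless Mᵣ, as a mass in daltons for one molecule, and as a mass in grams for one mole.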


(a) ⁺H₃N-CH(R)-COO⁻

(b) ⁺H₃N-CH(R₁)-CO-NH-CH(R₂)-COO⁻

▲ Figure 1.3

Structure of an amino acid and a dipeptide. (a) Amino acids contain an amino group
(blue) and a carboxylate group (red). Different
amino acids contain different side chains
(designated -R). (b) A dipeptide is produced
when the amino group of one amino
acid reacts with the carboxylate group of
another to form a peptide bond (red).


Biochemical molecules are 
three-dimensional objects. 

A. Proteins 

Twenty common amino acids are incorporated into proteins in all cells. Each amino 
acid contains an amino group and a carboxylate group, as well as a side chain (R group) 
that is unique to each amino acid (Figure 1.3a). The amino group of one amino acid 
and the carboxylate group of another are condensed during protein synthesis to form 
an amide linkage, as shown in Figure 1.3b. The bond between the carbon atom of one 
amino acid residue and the nitrogen atom of the next residue is called a peptide bond. 
The end-to-end joining of many amino acids forms a linear polypeptide that may con- 
tain hundreds of amino acid residues. A functional protein can be a single polypeptide 
or it can consist of several distinct polypeptide chains that are tightly bound to form a 
more complex structure. 

Many proteins function as enzymes. Others are structural components of cells and 
organisms. Linear polypeptides fold into a distinct three-dimensional shape. This shape
is determined largely by the sequence of its amino acid residues. This sequence infor- 
mation is encoded in the gene for the protein. The function of a protein depends on its 
three-dimensional structure, or conformation. 

The structures of many proteins have been determined and several principles gov- 
erning the relationship between structure and function have become clear. For example, 
many enzymes contain a cleft, or groove, that binds the substrates of a reaction. This 
cavity contains the active site of the enzyme — the region where the chemical reaction 
takes place. Figure 1.4a shows the structure of the enzyme lysozyme that catalyzes the 
hydrolysis of specific carbohydrate polymers. Figure 1.4b shows the structure of the en- 
zyme with the substrate bound in the cleft. We will discuss the relationship between 
protein structure and function in Chapters 4 and 6. 

There are many ways of representing the three-dimensional structures of biopoly- 
mers such as proteins. The lysozyme molecule in Figure 1.4 is shown as a cartoon where 
the conformation of the polypeptide chain is represented as a combination of wires, 
helical ribbons, and broad arrows. Other kinds of representations in the following chap- 
ters include images that show the position of every atom. Computer programs that cre- 
ate these images are freely available on the Internet and the structural data for proteins 
can be retrieved from a number of database sites. With a little practice, any student can 
view these molecules on a computer monitor. 

B. Polysaccharides 

Carbohydrates, or saccharides, are composed primarily of carbon, oxygen, and hydro- 
gen. This group of compounds includes simple sugars (monosaccharides) as well as 
their polymers (polysaccharides). All monosaccharides and all residues of polysaccha- 
rides contain several hydroxyl groups and are therefore polyalcohols. The most com- 
mon monosaccharides contain either five or six carbon atoms. 

Sugar structures can be represented in several ways. For example, ribose (the most 
common five-carbon sugar) can be shown as a linear molecule containing four hydroxyl 
groups and one aldehyde group (Figure 1.5a). This linear representation is called a
Fischer projection (after Emil Fischer). In its usual biochemical form, however, the
structure of ribose is a ring with a covalent bond between the carbon of the aldehyde group
(C-1) and the oxygen of the C-4 hydroxyl group, as shown in Figure 1.5b. The ring form
is most commonly shown as a Haworth projection (Figure 1.5c). This representation is a 
more accurate way of depicting the actual structure of ribose. The Haworth projection is 
rotated 90° with respect to the Fischer projection and portrays the carbohydrate ring as a 
plane with one edge projecting out of the page (represented by the thick lines). However, 
the ring is not actually planar. It can adopt numerous conformations in which certain 
ring atoms are out-of-plane. In Figure 1.5d, for example, the C-2 atom of ribose lies 
above the plane formed by the rest of the ring atoms. 

Some conformations are more stable than others so the majority of ribose mole- 
cules can be represented by one or two of the many possible conformations. Neverthe- 
less, it’s important to note that most biochemical molecules exist as a collection of 
structures with different conformations. The change from one conformation to another 
does not require the breaking of any covalent bonds. In contrast, the two basic forms of 
carbohydrate structures, linear and ring forms, do require the breaking and forming of 
covalent bonds. 

Glucose is the most abundant six-carbon sugar (Figure 1.6a on page 8). It is the 
monomeric unit of cellulose, a structural polysaccharide, and of glycogen and starch, 
which are storage polysaccharides. In these polysaccharides, each glucose residue is
joined to the next by a covalent bond between C-1 of one glucose molecule
and one of the hydroxyl groups of another. This bond is called a glycosidic bond. In
cellulose, C-1 of each glucose residue is joined to the C-4 hydroxyl group of the next
residue (Figure 1.6b). The hydroxyl groups on adjacent chains of cellulose interact
noncovalently, creating strong, insoluble fibers. Cellulose is probably the most abundant
biopolymer on Earth because it is a major component of flowering plant stems includ- 
ing tree trunks. We will discuss carbohydrates further in Chapter 8. 

C. Nucleic Acids 

Nucleic acids are large macromolecules composed of monomers called nucleotides. The 
term polynucleotide is a more accurate description of a single molecule of nucleic acid, 
just as polypeptide is a more accurate term than protein for single molecules composed 
of amino acid residues. The term nucleic acid refers to the fact that these
polynucleotides were first detected as acidic molecules in the nucleus of eukaryotic cells. We
now know that nucleic acids are not confined to the eukaryotic nucleus but are
abundant in the cytoplasm and in prokaryotes that don’t have a nucleus.

▲ Figure 1.4 Chicken (Gallus gallus) egg white
lysozyme. (a) Free lysozyme. Note the
characteristic cleft that includes the active site
of the enzyme. (b) Lysozyme with bound
substrate. [PDB 1LZC].

The rules for drawing a molecule as a
Fischer projection are described in
Section 8.1.

Conformations of monosaccharides are
described in more detail in Section 8.3.

(a) Fischer projection (open-chain form) (b) Fischer projection (ring form)
(c) Haworth projection (d) Envelope conformation

▲ Figure 1.5

Representations of the structure of ribose. (a) In the Fischer projection, ribose is drawn as a linear molecule. (b) In its usual biochemical
form, the ribose molecule is in a ring, shown here as a Fischer projection. (c) In a Haworth projection, the ring is depicted as lying
perpendicular to the page (as indicated by the thick lines, which represent the bonds closest to the viewer). (d) The ring of ribose is not
actually planar but can adopt 20 possible conformations in which certain ring atoms are out-of-plane. In the conformation shown, C-2 lies
above the plane formed by the rest of the ring atoms.

Figure 1.6 ►

Glucose and cellulose. (a) Haworth projection
of glucose. (b) Cellulose, a linear polymer of
glucose residues. Each residue is joined to
the next by a glycosidic bond (red).

The structures of nucleic acids are
described in Chapter 19.

▲ Figure 1.7

Deoxyribose, the sugar found in
deoxyribonucleotides. Deoxyribose lacks a hydroxyl group
at C-2.

The role of ATP in biochemical
reactions is described in Section 10.7.

Figure 1.8 ►

Structure of adenosine triphosphate (ATP). The
nitrogenous base adenine (blue) is attached
to ribose (black). Three phosphoryl groups
(red) are also bound to the ribose.

Nucleotides consist of a five-carbon sugar, a heterocyclic nitrogenous base, and at 
least one phosphate group. In ribonucleotides, the sugar is ribose; in deoxyribonu- 
cleotides, it is the derivative deoxyribose (Figure 1.7). The nitrogenous bases of nu- 
cleotides belong to two families known as purines and pyrimidines. The major purines 
are adenine (A) and guanine (G); the major pyrimidines are cytosine (C), thymine (T), 
and uracil (U). In a nucleotide, the base is joined to C-1 of the sugar, and the phosphate
group is attached to one of the other sugar carbons (usually C-5). 

The structure of the nucleotide adenosine triphosphate (ATP) is shown in Figure 1.8.
ATP consists of an adenine moiety linked to ribose by a glycosidic bond. There are three
phosphoryl groups (designated α, β, and γ) esterified to the C-5 hydroxyl group of the
ribose. The linkage between ribose and the α-phosphoryl group is a phosphoester linkage
because it includes a carbon and a phosphorus atom, whereas the β- and γ-phosphoryl
groups in ATP are connected by phosphoanhydride linkages that don’t involve carbon
atoms (see Figure 1.2). All phosphoanhydrides possess considerable chemical potential
energy and ATP is no exception. It is the central carrier of energy in living cells. The potential
energy associated with the hydrolysis of ATP can be used directly in biochemical reactions or
coupled to a reaction in a less obvious way.

In polynucleotides, the phosphate group of one nucleotide is covalently linked to 
the C-3 oxygen atom of the sugar of another nucleotide creating a second phosphoester 
linkage. The entire linkage between the carbons of adjacent nucleotides is called a phos- 
phodiester linkage because it contains two phosphoester linkages (Figure 1.9). Nucleic 
acids contain many nucleotide residues and are characterized by a backbone consisting 
of alternating sugars and phosphates. In DNA, the bases of two different polynucleotide 
strands interact to form a helical structure. 

There are several ways of depicting nucleic acid structures depending on which
features are being described. The ball-and-stick model shown in Figure 1.10 is ideal for
showing the individual atoms and the ring structure of the sugars and the bases. In this case, the
two helices can be traced by following the sugar-phosphate backbone emphasized by
the presence of the purple phosphorus atoms surrounded by four red oxygen atoms. 
The individual base pairs are viewed edge-on in the interior of the molecule. We will see 
several other DNA models in Chapter 19. 

RNA contains ribose rather than deoxyribose and it is usually a single-stranded 
polynucleotide. There are four different kinds of RNA molecules. Messenger RNA 
(mRNA) is involved directly in the transfer of information from DNA to protein. Transfer 
RNA (tRNA) is a smaller molecule required for protein synthesis. Ribosomal RNA 
(rRNA) is the major component of ribosomes. Cells also contain a heterogeneous class of 
small RNAs that carry out a variety of different functions. In Chapters 19 to 22, we will see 
how these RNA molecules differ and how their structures reflect their biological roles. 

D. Lipids and Membranes 

The term “lipid” refers to a diverse class of molecules that are rich in carbon and hydro- 
gen but contain relatively few oxygen atoms. Most lipids are not soluble in water but 
they do dissolve in some organic solvents. Lipids often have a polar, hydrophilic
(water-loving) head and a nonpolar, hydrophobic (water-fearing) tail (Figure 1.11). In an
aqueous environment, the hydrophobic tails of such lipids associate while the hydrophilic
heads are exposed to water, producing a sheet called a lipid bilayer. Lipid bilayers form the
structural basis of all biological membranes. Membranes separate cells or compartments 
within cells from their environments by acting as barriers that are impermeable to most 
water-soluble compounds. Membranes are flexible because lipid bilayers are stabilized by 
noncovalent forces. 

The simplest lipids are fatty acids, which are long-chain hydrocarbons with a
carboxylate group at one end. Fatty acids are commonly found as part of larger molecules
called glycerophospholipids consisting of glycerol 3-phosphate and two fatty acyl groups
(Figure 1.12). Glycerophospholipids are major components of biological membranes.

Other kinds of lipids include steroids and waxes. Steroids are molecules like choles- 
terol and many sex hormones. Waxes are common in plants and animals but perhaps 
the most familiar examples are beeswax and the wax that forms in your ears. 

Membranes are among the largest and most complex cellular structures. Strictly 
speaking, membranes are aggregates, not polymers. However, the association of lipid 
molecules with each other creates structures that exhibit properties not shown by indi- 
vidual component molecules. Their insolubility in water and the flexibility of lipid ag- 
gregates give biological membranes many of their characteristics. 

◄ Figure 1.9 

Structure of a dinucleotide. One deoxyribonu- 
cleotide residue contains the pyrimidine 
thymine (top), and the other contains the 
purine adenine (bottom). The residues are 
joined by a phosphodiester linkage between 
the two deoxyribose moieties. (The carbon 
atoms of deoxyribose are numbered with 
primes to distinguish them from the atoms 
of the bases thymine and adenine.) 

▲ Figure 1.10

Short segment of a DNA molecule. Two different
polynucleotides associate to form a
double helix. The sequence of base pairs
on the inside of the helix carries genetic
information.

▲ Figure 1.11

Model of a membrane lipid. The molecule
consists of a polar head (blue) and a nonpolar
tail (yellow).

Hydrophobic interactions are discussed 
in Chapter 2. 

Figure 1.12 ► 

Structures of glycerol 3-phosphate and a glyc- 
erophospholipid. (a) The phosphate group of 
glycerol 3-phosphate is polar, (b) In a glyc- 
erophospholipid, two nonpolar fatty acid 
chains are bound to glycerol 3-phosphate 
through ester linkages. X represents a sub- 
stituent of the phosphate group. 

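(a) Glycerol 3-phosphate: HOCH₂-CH(OH)-CH₂-O-PO₃²⁻ (carbons numbered 1, 2, 3 from left to right)

(b) Glycerophospholipid: two fatty acyl groups (R-CO-) joined by ester linkages to the
oxygen atoms at C-1 and C-2 of glycerol 3-phosphate, with the substituent X on the phosphate group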


Most of the energy required for life is 
supplied by light from the sun. 

Biological membranes also contain proteins as shown in Figure 1.13. Some of these 
membrane proteins serve as channels for the entry of nutrients and the exit of wastes. 
Other proteins catalyze reactions that occur specifically at the membrane surface. They 
are the sites of many important biochemical reactions. We will discuss lipids and bio- 
logical membranes in greater detail in Chapter 9. 

1.4 The Energetics of Life 

The activities of living organisms do not depend solely on the biomolecules described 
in the preceding section and on the multitude of smaller molecules and ions found in 
cells. Life also requires the input of energy. Living organisms are constantly transform- 
ing energy into useful work to sustain themselves, to grow, and to reproduce. Almost all 
this energy is ultimately supplied by the sun. 



▲ Figure 1.13

General structure of a biological membrane. Biological membranes consist of a lipid bilayer with as- 
sociated proteins. The hydrophobic tails of individual lipid molecules associate to form the core of 
the membrane. The hydrophilic heads are in contact with the aqueous medium on either side of 
the membrane. Most membrane proteins span the lipid bilayer; others are attached to the mem- 
brane surface in various ways. 


Sunlight is captured by plants, algae, and photosynthetic bacteria and used for the 
synthesis of biological compounds. Photosynthetic organisms can be ingested as food 
and their component molecules used by organisms such as protozoa, fungi, nonphoto- 
synthetic bacteria, and animals. These organisms cannot directly convert sunlight into 
useful biochemical energy. The breakdown of organic compounds in both photosyn- 
thetic and nonphotosynthetic organisms releases energy that can be used to drive the 
synthesis of new molecules and macromolecules. 

Photosynthesis is one of the key biochemical processes that are essential for life, 
even though many species, including animals, benefit only indirectly. One of the by- 
products of photosynthesis is oxygen. It is likely that Earth’s atmosphere was trans- 
formed by oxygen-producing photosynthetic bacteria during the first several billion 
years of its history (a natural example of terraforming). In Chapter 15, we will discuss 
the amazing set of reactions that capture sunlight and use it to synthesize biopolymers. 

The term metabolism describes the myriad reactions in which organic compounds 
are synthesized and degraded and useful energy is extracted, stored, and used. The study 
of the changes in energy during metabolic reactions is called bioenergetics. Bioenergetics
is part of the field of thermodynamics, a branch of physical science that deals with en- 
ergy changes. Biochemists have discovered that the basic thermodynamic principles 
that apply to energy flow in nonliving systems also apply to the chemistry of life. 

Thermodynamics is a complex and highly sophisticated subject but we don’t need 
to master all of its complexities and subtleties in order to understand how it can con- 
tribute to an understanding of biochemistry. We will avoid some of the complications 
of thermodynamics in this book and concentrate instead on using it to describe some 
biochemical principles (discussed in Chapter 10). 

A. Reaction Rates and Equilibria 

The rate, or speed, of a chemical reaction depends on the concentration of the reac- 
tants. Consider a simple chemical reaction where molecule A collides with molecule B 
and undergoes a reaction that produces products C and D. 

$$ A + B \longrightarrow C + D \qquad (1.2) $$

The rate of this reaction is determined by the concentrations of A and B. At high 
concentrations, these reactants are more likely to collide with each other; at low concen- 
trations, the reaction might take a long time. We indicate the concentration of a reacting 
molecule by enclosing its symbol in square brackets. Thus, [A] means “the concentra- 
tion of A” — usually expressed in moles per liter (M). The rate of the reaction is directly 
proportional to the product of the concentrations of A and B. This rate can be described 
by a proportionality constant, k, that is more commonly called a rate constant.

$$ \text{rate} \propto [A][B] \qquad \text{rate} = k[A][B] \qquad (1.3) $$
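
As a small illustration of Equation 1.3, the sketch below evaluates the rate law for a few concentrations. The rate constant and the concentrations are hypothetical values chosen only to show the proportionality; they do not describe any real reaction.

```python
def reaction_rate(k, conc_a, conc_b):
    """Rate of A + B -> C + D from the rate law: rate = k[A][B]."""
    return k * conc_a * conc_b


k = 2.0  # hypothetical rate constant (M^-1 s^-1), for illustration only

base = reaction_rate(k, 0.1, 0.1)          # [A] = [B] = 0.1 M
doubled_a = reaction_rate(k, 0.2, 0.1)     # doubling [A] alone
doubled_both = reaction_rate(k, 0.2, 0.2)  # doubling both reactants

print(f"doubling [A] multiplies the rate by {doubled_a / base:.1f}")
print(f"doubling [A] and [B] multiplies the rate by {doubled_both / base:.1f}")
```

Doubling one reactant doubles the rate and doubling both quadruples it, which is what "proportional to the product of the concentrations" means.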

Almost all biochemical reactions are reversible. This means that C and D can col- 
lide and undergo a chemical reaction to produce A and B. The rate of the reverse reac- 
tion will depend on the concentrations of C and D and that rate can be described by a 
different rate constant. By convention, the forward rate constant is k₁ and the reverse
rate constant is k₋₁. Reaction 1.4 is a more accurate way of depicting the reaction
shown in Reaction 1.2. 

$$ A + B \underset{k_{-1}}{\overset{k_1}{\rightleftharpoons}} C + D \qquad (1.4) $$


If we begin a test tube reaction by mixing high concentrations of A and B, then the 
initial concentrations of C and D will be zero and the reaction will only proceed from 
left to right. The rate of the initial reaction will depend on the beginning concentrations 
of A and B and the rate constant k₁. As the reaction proceeds, the amount of A and B
will decrease and the amount of C and D will increase. The reverse reaction will start to
become significant as the products accumulate. The speed of the reverse reaction will
depend on the concentrations of C and D and the rate constant k₋₁.

▲ Sunlight on a tropical rain forest. Plants 
convert sunlight and inorganic nutrients into 
organic compounds. 

[Diagram labels: Inorganic nutrients (CO₂, H₂O); Light energy; Organic compounds; All organisms; Waste (CO₂, H₂O); Macromolecules]

▲ Energy flow. Photosynthetic organisms 
capture the energy of sunlight and use it to 
synthesize organic compounds. The break- 
down of these compounds in both photosyn- 
thetic and nonphotosynthetic organisms 
generates energy needed for the synthesis of 
macromolecules and for other cellular re- 



The rate of a chemical reaction depends 
on the concentrations of the reactants. 
The higher the concentration, the faster 
the reaction. 

At some point, the rates of the forward and reverse reactions will be equal and there 
will be no further change in the concentrations of A, B, C, and D. In other words, the re- 
action will have reached equilibrium. At equilibrium, 

$$ k_1[A][B] = k_{-1}[C][D] \qquad (1.5) $$


Almost all biochemical reactions are
reversible. When the rates of the forward and
reverse reactions are equal, the reaction is at
equilibrium.

In many cases we are interested in the final concentrations of the reactants and 
products once the reaction has reached equilibrium. The ratio of product concentrations
to reactant concentrations defines the equilibrium constant, K_eq. The equilibrium
constant is also equal to the ratio of the forward and reverse rate constants and, since k₁
and k₋₁ are constants, so is K_eq. Rearranging Equation 1.5 gives

$$ \frac{k_1}{k_{-1}} = \frac{[C][D]}{[A][B]} = K_{eq} \qquad (1.6) $$

In theory, the concentrations of products and reactants could be identical once the
reaction reaches equilibrium. In that case, K_eq = 1 and the forward and reverse rate
constants have the same values. In most cases the value of the equilibrium constant
ranges from 10⁻³ to 10³, meaning that the rate of one of the reactions is much faster
than the other. If K_eq = 10³ then the reaction will proceed mostly to the right and the
final concentrations of C and D will be much higher than the concentrations of A and
B. In this case, the forward rate constant (k₁) will be 1000 times greater than the reverse
rate constant (k₋₁). This means that collisions between C and D are much less likely to
produce a chemical reaction than collisions between A and B.
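
The approach to equilibrium described above can be sketched numerically. The code below is a minimal illustration, not a method from this textbook: it steps the forward and reverse rate laws forward in time with simple Euler integration and compares the final concentration ratio with k₁/k₋₁. The rate constants, starting concentrations, time step, and function name are all hypothetical.

```python
def simulate_to_equilibrium(k_fwd, k_rev, a0, b0, dt=0.001, steps=100_000):
    """Euler integration of A + B <=> C + D, starting with no products."""
    a, b, c, d = a0, b0, 0.0, 0.0
    for _ in range(steps):
        forward = k_fwd * a * b   # forward rate, k1[A][B]
        reverse = k_rev * c * d   # reverse rate, k-1[C][D]
        change = (forward - reverse) * dt
        a -= change
        b -= change
        c += change
        d += change
    return a, b, c, d


# Hypothetical rate constants chosen so that k1/k-1 = 1000
k_fwd, k_rev = 10.0, 0.01
a, b, c, d = simulate_to_equilibrium(k_fwd, k_rev, a0=1.0, b0=1.0)

ratio = (c * d) / (a * b)
print(f"[C][D]/[A][B] at the end of the run: {ratio:.1f}")
print(f"k1/k-1:                              {k_fwd / k_rev:.1f}")
```

Once the forward and reverse rates balance, the concentrations stop changing and the ratio [C][D]/[A][B] settles at k₁/k₋₁, as Equation 1.6 requires. Changing the starting concentrations changes the final amounts but not this ratio.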

▲ Josiah Willard Gibbs (1839-1903). Gibbs 
was one of the greatest American scientists 
of the 19th century. He founded the modern 
field of chemical thermodynamics. 

B. Thermodynamics 

If we know the energy changes associated with a reaction or process, we can predict the 
equilibrium concentrations. We can also predict the direction of a reaction provided we 
know the initial concentrations of reactants and products. The thermodynamic quan- 
tity that provides this information is the Gibbs free energy (G), named after J. Willard 
Gibbs who first described this quantity in 1878. 

It turns out that molecules in solution have a certain energy that depends on temperature,
pressure, concentration, and other states. The Gibbs free energy change (ΔG)
for a reaction is the difference between the free energy of the products and the free energy
of the reactants. The overall Gibbs free energy change has two components known
as the enthalpy change (ΔH, the change in heat content) and the entropy change (ΔS,
the change in randomness). A biochemical process may generate heat or absorb it from 
the surroundings. Similarly, a process may occur with an increase or a decrease in the 
degree of disorder, or randomness, of the reactants. 

Starting with an initial solution of reactants and products, if the reaction proceeds 
to produce more products, then ΔG must be less than zero (ΔG < 0). In chemistry
terms, we say that the reaction is spontaneous and energy is released. When ΔG is
greater than zero (ΔG > 0), the reaction requires external energy to proceed and it will
not yield more products. In fact, more reactants will accumulate as the reverse reaction
is favored. When ΔG equals zero (ΔG = 0), the reaction is at equilibrium; the rates of
the forward and reverse reactions are identical and the concentrations of the products 
and reactants no longer change. 

We are mostly interested in the overall Gibbs free energy change, expressed as


The Gibbs free energy change (ΔG) is the
difference between the free energy of the 
products of a reaction and that of the 
reactants (substrates). 

$$ \Delta G = \Delta H - T\Delta S \qquad (1.7) $$

where T is the temperature in Kelvin. 

A series of linked processes, such as the reactions of a metabolic pathway in a cell, 
usually proceeds only when associated with an overall negative Gibbs free energy 
change. Biochemical reactions or processes are more likely to occur, both to a greater 
extent and more rapidly, when they are associated with an increase in entropy and a de- 
crease in enthalpy. 
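
A short worked example may help show how the two terms of Equation 1.7 compete. The sketch below uses made-up values for ΔH and ΔS (they do not describe any particular reaction) for a process that absorbs heat but increases entropy, and evaluates ΔG at two temperatures.

```python
def gibbs_change(delta_h, delta_s, temperature):
    """Delta G = Delta H - T * Delta S.

    delta_h in kJ/mol, delta_s in kJ/(K mol), temperature in kelvin.
    """
    return delta_h - temperature * delta_s


# Hypothetical process: heat is absorbed (dH > 0), entropy increases (dS > 0)
dH = 40.0   # kJ/mol, illustrative value
dS = 0.150  # kJ/(K mol), illustrative value

dg_cold = gibbs_change(dH, dS, 250.0)
dg_warm = gibbs_change(dH, dS, 298.0)

print(f"Delta G at 250 K: {dg_cold:+.1f} kJ/mol")
print(f"Delta G at 298 K: {dg_warm:+.1f} kJ/mol")
```

With these numbers the sign of ΔG changes between the two temperatures, because the TΔS term grows with temperature while ΔH is treated as constant in this simplified picture.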


If we knew the Gibbs free energy of every product and every reactant, it would be a 
simple matter to calculate the Gibbs free energy change for a reaction by using Equation 1.8. 

$$ \Delta G_{reaction} = \Delta G_{products} - \Delta G_{reactants} \qquad (1.8) $$

Unfortunately, we don’t often know the absolute Gibbs free energies of every bio- 
chemical molecule. What we do know are the thermodynamic parameters associated 
with the synthesis of these molecules from simple precursors. For example, glucose can 
be formed from water and carbon dioxide. We don’t need to know the absolute values of 
the Gibbs free energy of water and carbon dioxide in order to calculate the amount of 
enthalpy and entropy that are required to bring them together to make glucose. In fact, 
the heat released by the reverse reaction (breakdown of glucose to carbon dioxide and 
water) can be measured using a calorimeter. This gives us a value for the change in enthalpy
of synthesis of glucose (ΔH). The entropy change (ΔS) for this reaction can also
be determined. We can use these quantities to determine the Gibbs free energy of the reaction.
The true Gibbs free energy of formation, Δ_fG, is the difference between the absolute
free energy of glucose and that of the elements carbon, oxygen and hydrogen.

There are tables giving these Gibbs free energy values for the formation of most bi- 
ological molecules. They can be used to calculate the Gibbs free energy change for a re- 
action in the same way that we might use absolute values as in Equation 1.9. 

$$ \Delta G_{reaction} = \Delta_f G_{products} - \Delta_f G_{reactants} \qquad (1.9) $$

In this textbook we will often refer to the Δ_fG value as the Gibbs free energy of a
compound since it can be easily used in calculations as though it were an absolute value. 
It can also be called just “Gibbs energy” by dropping the word “free.” 
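
The bookkeeping in Equation 1.9 can be sketched for the breakdown of glucose to carbon dioxide and water. The formation values below are approximate figures of the kind listed in general chemistry tables for ordinary standard conditions; they are not taken from this textbook and should be checked against a current table before being relied on. The dictionary and function names are hypothetical.

```python
# Approximate standard Gibbs free energies of formation, kJ/mol at 25 °C.
# Illustrative values only (assumption: not from this textbook's tables).
DELTA_F_G = {
    "glucose": -910.4,
    "O2": 0.0,        # an element in its standard state
    "CO2": -394.4,
    "H2O": -237.1,
}


def reaction_delta_g(reactants, products, table=DELTA_F_G):
    """Sum of formation energies of products minus that of reactants.

    reactants and products map a compound name to its stoichiometric
    coefficient in the balanced reaction.
    """
    def side_total(side):
        return sum(coeff * table[name] for name, coeff in side.items())

    return side_total(products) - side_total(reactants)


# glucose + 6 O2 -> 6 CO2 + 6 H2O
dg = reaction_delta_g({"glucose": 1, "O2": 6}, {"CO2": 6, "H2O": 6})
print(f"Delta G for glucose oxidation: {dg:.1f} kJ/mol")
```

The large negative result reflects how much free energy becomes available when glucose is oxidized. The value depends on the table used and on the conditions, a point taken up in the paragraphs that follow.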

There’s an additional complication that hasn’t been mentioned. For any reaction, in- 
cluding the degradation of glucose, the actual free energy change depends on the concen- 
trations of reactants and products. Let’s consider the hypothetical reaction in Equation 1.2. 
If we begin with a certain amount of A and B and none of the products C and D, then it’s 
obvious that the reaction can only go in one direction, at least initially. In thermodynamic 
terms, ΔG_reaction is favorable under these conditions. The higher the concentrations of A
and B, the more likely the reaction will occur. This is an important point that we will re- 
turn to many times as we learn about biochemistry — the actual Gibbs free energy change 
in a reaction depends on the concentrations of the reactants and products. 

What we need are some standard values of ΔG that can be adjusted for concentration.
These standard values are the Gibbs free energy changes measured under certain
conditions. By convention, the standard conditions are 25°C (298 K), 1 atm standard 
pressure, and 1.0 M concentration of all products and reactants. In most biochemical 
reactions, the concentration of H⁺ is important, and this is indicated by the pH, as will
be described in the next chapter. The standard condition for biochemical reactions is
pH = 7.0, which corresponds to 10⁻⁷ M H⁺ (rather than 1.0 M as for other reactants
and products). The Gibbs free energy change under these standard conditions is indicated
by the symbol ΔG°′.

The actual Gibbs free energy is related to its standard free energy by 

$$ \Delta G_A = \Delta G_A^{\circ\prime} + RT \ln[A] \qquad (1.10) $$

where R is the universal gas constant (8.315 J K⁻¹ mol⁻¹) and T is the temperature in
Kelvin. Gibbs free energy is expressed in units of kJ mol⁻¹. (An older unit is kcal mol⁻¹,
which equals 4.184 kJ mol⁻¹.) The term RT ln[A] is sometimes given as 2.303 RT
log [A].

C. Equilibrium Constants and Standard Gibbs Free Energy Changes 

For a given reaction, such as that in Reaction 1.2, the actual Gibbs free energy change is 
related to the standard free energy change by 


$$ \Delta G_{reaction} = \Delta G^{\circ\prime}_{reaction} + RT \ln \frac{[C][D]}{[A][B]} \qquad (1.11) $$
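
To make the concentration dependence concrete, the sketch below evaluates the relationship between the actual and standard Gibbs free energy changes for the reaction A + B ⇌ C + D. The standard free energy change and all concentrations are hypothetical; only the gas constant is a physical value.

```python
import math

R = 8.315e-3  # universal gas constant in kJ K^-1 mol^-1


def actual_delta_g(dg_standard, conc_a, conc_b, conc_c, conc_d, temperature=298.0):
    """Actual Delta G (kJ/mol) for A + B <=> C + D.

    dg_standard is the standard Gibbs free energy change in kJ/mol;
    concentrations are in mol/L; temperature is in kelvin.
    """
    ratio = (conc_c * conc_d) / (conc_a * conc_b)
    return dg_standard + R * temperature * math.log(ratio)


dg_std = 5.0  # hypothetical standard free energy change, kJ/mol

# Everything at 1 M: the logarithmic term vanishes
at_standard = actual_delta_g(dg_std, 1.0, 1.0, 1.0, 1.0)

# Products kept scarce relative to reactants (hypothetical concentrations)
products_scarce = actual_delta_g(dg_std, 1e-2, 1e-2, 1e-4, 1e-4)

print(f"all species at 1 M: {at_standard:+.2f} kJ/mol")
print(f"products scarce:    {products_scarce:+.2f} kJ/mol")
```

Even though the standard value in this example is positive, the actual free energy change becomes negative when the products are present at much lower concentrations than the reactants.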







▲ The heat given off during a reaction can 
be determined by carrying out the reaction 
in a sensitive calorimeter. 

The importance of the relationship 
between ΔG and concentration is
explained in Section 10.5. 


The standard Gibbs free energy change
(ΔG°′) tells us the direction of a reaction
when the concentrations of all products
and reactants are at 1 M concentration.
These conditions will never occur in
living cells. Biochemists are only
interested in actual Gibbs free energy
changes (ΔG), which are usually close to
zero. The standard Gibbs free energy
change (ΔG°′) tells us the relative
concentrations of reactants and products
when the reaction reaches equilibrium.




$$ \Delta G = \Delta G^{\circ\prime} + RT \ln \frac{[C][D]}{[A][B]} $$
at equilibrium: $$ \Delta G^{\circ\prime} + RT \ln K_{eq} = 0 $$


The rate of a reaction is not determined 
by the Gibbs free energy change. 

If the reaction has reached equilibrium, the ratio of concentrations in the last term of 
Equation 1.11 is, by definition, the equilibrium constant (K_eq). When the reaction is at
equilibrium there is no net change in the concentrations of reactants and products, so
the actual Gibbs free energy change is zero (ΔG_reaction = 0). This allows us to write an
equation relating the standard Gibbs free energy change and the equilibrium constant. 
Thus, at equilibrium, 

$$ \Delta G^{\circ\prime}_{reaction} = -RT \ln K_{eq} = -2.303\,RT \log K_{eq} \qquad (1.12) $$

This important equation relates thermodynamics and reaction equilibria. Note that 
it is the equilibrium constant that is related to the Gibbs free energy change and not the 
individual rate constants described in Equations 1.5 and 1.6. It is the ratio of those individual
rate constants that is important and not their absolute values. The forward and
reverse rates might both be very slow or very fast and still give the same ratio. 
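
The conversion between an equilibrium constant and a standard Gibbs free energy change is easy to sketch in code. The function names and the choice of 298 K as the default temperature are assumptions made for illustration; the gas constant is the value quoted in this section.

```python
import math

R = 8.315  # universal gas constant in J K^-1 mol^-1


def standard_dg_from_keq(k_eq, temperature=298.0):
    """Standard Gibbs free energy change in kJ/mol from K_eq."""
    return -R * temperature * math.log(k_eq) / 1000.0


def keq_from_standard_dg(dg_kj_per_mol, temperature=298.0):
    """Equilibrium constant from a standard free energy change in kJ/mol."""
    return math.exp(-dg_kj_per_mol * 1000.0 / (R * temperature))


for k_eq in (1e-3, 1.0, 1e3):
    print(f"K_eq = {k_eq:g}: standard Delta G = {standard_dg_from_keq(k_eq):+.2f} kJ/mol")
```

An equilibrium constant of 1 corresponds to a standard free energy change of zero. Each tenfold change in K_eq shifts the standard free energy change by 2.303 RT, which is a little under 6 kJ mol⁻¹ at 298 K.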

D. Gibbs Free Energy and Reaction Rates 

Thermodynamic considerations can tell us if a reaction is favored but do not tell how 
quickly a reaction will occur. We know, for example, that iron rusts and copper turns 
green, but these reactions may take only a few seconds or many years. That’s because
the rate of a reaction depends on other factors, such as the activation energy.

Activation energies are usually depicted as a hump, or barrier, in diagrams that 
show the progress of a reaction from left to right. In Figure 1.14, we plot the Gibbs free 
energy at different stages of a reaction as it goes from reactants to products. This 
progress is called the reaction coordinate. 

The overall change in free energy (ΔG) can be negative, as shown on the left, or
positive, as shown on the right. In either case, there’s an excess of energy required in
order for the reaction to proceed. The difference between the top of the energy peak and
the energy of the product or reactant with the highest Gibbs free energy is known as the
activation energy (ΔG‡).

The rate of this reaction depends on the nature of the reaction. Using our example 
from Equation 1.2, if every collision between A and B is effective, then the rate is likely 
to be fast. On the other hand, if the orientation of individual molecules has to be exactly 
right for a reaction to occur then many collisions will be nonproductive and the rate 
will be slower. In addition to orientation, the rate depends on the kinetic energy of the 
individual molecules. At any given temperature some will be moving slowly when they 
collide and they will not have enough energy to react. Others will be moving rapidly 
and will carry a lot of kinetic energy. 

The activation energy is meant to reflect these parameters. It is a measure of the prob- 
ability that a reaction will occur. The activation energy depends on the temperature — it 
is lower at higher temperatures. It also depends on the concentration of reactants — 
at high concentrations there will be more collisions and the rate of the reaction will be
faster.

The important point is that the rate of a reaction is not predictable from the overall 
Gibbs free energy change. Some reactions, such as the oxidation of iron or copper, will 
proceed very slowly because their activation energies are high. 

Figure 1.14 ► 

The progress of a reaction is depicted from left 
(reactants) to right (products). In the first dia- 
gram, the overall Gibbs free energy change 
is negative since the Gibbs free energy of 
the products is lower than that of the reac- 
tants. In order for the reaction to proceed, 
the reactants have to overcome an activation 
energy barrier (ΔG‡). In the second
diagram, the overall Gibbs free energy change
for the reaction is positive and the minimum 
activation energy is smaller. This means that 
the reverse reaction will proceed faster than 
the forward reaction. 

Reaction coordinate 


Most of the reactions that take place inside a cell are very slow in the test tube even 
though they are thermodynamically favored. Inside a cell the rates of the normally slow 
reactions are accelerated by enzymes. The rates of enzyme-catalyzed reactions can be
10²⁰ times greater than the rates of the corresponding uncatalyzed reactions. We will
spend some time describing how enzymes work — it is one of the most fascinating top- 
ics in biochemistry. 

1.5 Biochemistry and Evolution 

A famous geneticist, Theodosius Dobzhansky, once said, “Nothing in biology makes 
sense except in the light of evolution.” This is also true of biochemistry. Biochemists and 
molecular biologists have made major contributions to our understanding of evolution 
at the molecular level and the evidence they have uncovered confirms and extends the 
data from comparative anatomy, population genetics, and paleontology. We’ve come a 
long way from the original evidence of evolution first summarized by Charles Darwin 
in the middle of the 19th century. 

We now have a very reliable outline of the history of life and the relationships of the 
many diverse species in existence today. The first organisms were single cells that we would 
probably classify today as prokaryotes. Prokaryotes, or bacteria, do not have a membrane- 
bounded nucleus. Fossils of primitive bacteria-like organisms have been found in geologi- 
cal formations that are at least 3 billion years old. The modern species of bacteria belong to 
such diverse groups as the cyanobacteria, which are capable of photosynthesis, and the 
thermophiles, which inhabit hostile environments such as thermal hot springs. 

Eukaryotes have cells that possess complex internal architecture, including a prominent
nucleus. In general, eukaryotic cells are more complex and much larger than prokaryotic
cells. A typical eukaryotic tissue cell has a diameter of about 25 μm (25,000 nm),
whereas prokaryotic cells are typically about 1/10 that size. However, evolution has pro- 
duced tremendous diversity and extreme deviations from typical sizes are common. For 
example, some eukaryotic unicellular organisms are large enough to be visible to the 
naked eye and some nerve cells in the spinal columns of vertebrates can be several feet 
long. There are also megabacteria that are larger than most eukaryotic cells. 

All cells on Earth (prokaryotes and eukaryotes) appear to have evolved from a com- 
mon ancestor that existed more than 3 billion years ago. The evidence for common an- 
cestry includes the presence in all living organisms of common biochemical building 
blocks, the same general patterns of metabolism, and a common genetic code (with 
rare, slight variations). We will see many examples of this evidence throughout this 
book. The basic plan of the primitive cell has been elaborated on with spectacular in- 
ventiveness through billions of years of evolution. 

The importance of evolution for a thorough understanding of biochemistry cannot 
be overestimated. We will encounter many pathways and processes that only make sense 

▲ Charles Darwin (1809-1882). Darwin pub- 
lished The Origin of Species in 1859. His 
theory of evolution by natural selection ex- 
plains adaptive evolution. 

◄ Burgess Shale animals. Many transitional 
fossils support the basic history of life that 
has been worked out over the past few centuries.
Pikaia (left) is a primitive chordate
from the time of the Cambrian explosion 
about 530 million years ago. These primi- 
tive chordates are the ancestors of all mod- 
ern chordates, including humans. On the 
right is Opabinia, a primitive invertebrate. 





[Figure labels: Other bacteria; Proteobacteria; Cyanobacteria; Gram-positive bacteria; Crenarchaeota; Euryarchaeota; Animals; Fungi; Plants; Protists]

◄ Figure 1.15 

The web of life. The two main groups 
of prokaryotes are the Eubacteria 
(green) and the Archaea (red). 
(Adapted from Doolittle (2000).) 

when we appreciate that they have evolved from more primitive precursors. The evidence 
for evolution at the molecular level is preserved in the sequences of the genes and proteins 
that we will study as we learn about biochemistry. In order to fully understand the funda- 
mental principles of biochemistry we will need to examine pathways and processes in a 
variety of different species including bacteria and a host of eukaryotic model organisms 
such as yeast, fruit flies, flowering plants, mice, and humans. The importance of compara- 
tive biochemistry has been recognized for over 100 years but its value has increased enor- 
mously in the last decade with the publication of complete genome sequences. We are 
now able to compare the complete biochemical pathways of many different species. 

The relationship of the earliest forms of life can be determined by comparing the 
sequences of genes and proteins in modern species. The latest evidence shows that the 
early forms of unicellular life exchanged genes frequently giving rise to a complicated 
network of genetic relationships. Eventually, the various lineages of bacteria and archae- 
bacteria emerged, along with primitive eukaryotes. Further evolution of eukaryotes oc- 
curred when they formed a symbiotic union with bacteria, giving rise to mitochondria 
and chloroplasts. 

The new “web of life” view of evolution (Figure 1.15) replaces a more traditional view
that separated prokaryotes into two entirely separate domains called Eubacteria and Ar- 
chaea. That distinction is not supported by the data from hundreds of sequenced genomes 
so we now see prokaryotes as a single large group with many diverse subgroups, some of 
which are shown in the figure. It is also clear that eukaryotes contain many genes that are 
more closely related to the old eubacterial groups as well as a minority of genes that are 
closer to the old archaeal groups. The early history of life seems to be dominated by rampant
gene exchange between species and this has led to a web of life rather than a tree of life. 

Many students are interested in human biochemistry, particularly those aspects of 
biochemistry that relate to health and disease. That is an exciting part of biochemistry 
but in order to obtain a deep understanding of who we are, we need to know where we 
came from. An evolutionary perspective helps explain why we can’t make some vitamins

and amino acids and why we have different blood types and different tolerances for 
milk products. Evolution also explains the unique physiology of animals, which have 
adapted to using other organisms as a source of metabolic fuel. 

1.6 The Cell Is the Basic Unit of Life 

Every organism is either a single cell or is composed of many cells. Cells exist in a re- 
markable variety of sizes and shapes but they can usually be classified as either eukary- 
otic or prokaryotic, although some taxonomists continue to split prokaryotes into two 
groups: Eubacteria and Archaea. 

A simple cell can be pictured as a droplet of water surrounded by a plasma mem- 
brane. The water droplet contains dissolved and suspended material including proteins, 
polysaccharides, and nucleic acids. The high lipid content of membranes makes them 
flexible and self-sealing. Membranes present impermeable barriers to large molecules and 
charged species. This property of membranes allows for much higher concentrations of 
biomolecules within cells than in the surrounding medium. 

The material enclosed by the plasma membrane of a cell is called the cytoplasm. 
The cytoplasm may contain large macromolecular structures and subcellular mem- 
brane-bound organelles. The aqueous portion of the cytoplasm minus the subcellular 
structures is called the cytosol. Eukaryotic cells contain a nucleus and other internal 
membrane-bound organelles within the cytoplasm. 

Viruses are subcellular infectious particles. They consist of a nucleic acid molecule 
surrounded by a protein coat and, in some cases, a membrane. Virus nucleic acid can 
contain as few as three genes or as many as several hundred. Despite their biological im- 
portance, viruses are not truly cells because they cannot carry out independent meta- 
bolic reactions. They propagate by hijacking the reproductive machinery of a host cell 
and diverting it to the formation of new viruses. In a sense, viruses are genetic parasites. 

There are thousands of different viruses. Those that infect prokaryotic cells are 
usually called bacteriophages, or phages. Much of what we know about biochemistry is 
derived from the study of viruses and bacteriophages and their interaction with the cells 
they infect. For example, introns were first discovered in a human adenovirus like the 
one shown on the first page of this chapter and the detailed mapping of genes was first 
carried out with bacteriophage T4. 

In the following two sections we will explore the structural features of typical 
prokaryotic and eukaryotic cells. 

1.7 Prokaryotic Cells: Structural Features 

Prokaryotes are usually single-celled organisms. The best studied of all living organisms 
is the bacterium Escherichia coli (Figure 1.16). This organism has served for half a cen- 
tury as a model biological system and many of the biochemical reactions described later 
in this book were first discovered in E. coli. E. coli is a fairly typical species of bacteria but 
some bacteria are as different from E. coli as we are from diatoms, daffodils and dragonflies. 

[Figure labels: Periplasmic space; Cell wall; Outer membrane; Plasma membrane]

◄ Figure 1.16

Escherichia coli. An E. coli cell is about
0.5 μm in diameter and 1.5 μm long.
Proteinaceous fibers called flagella rotate to 
propel the cell. The shorter pili aid in sexual 
conjugation and may help E. coli cells 
adhere to surfaces. The periplasmic space is 
an aqueous compartment separating the 
plasma membrane and the outer membrane. 



► Bacteriophage T4. Much of our current un- 
derstanding of biochemistry comes from 
studies of bacterial viruses such as bacterio- 
phage T4. 

▲ Max Delbruck and Salvatore Luria. Max Del- 
bruck (seated) and Salvatore Luria at the 
Cold Spring Harbor Laboratories in 1953. 
Delbruck and Luria founded the “phage 
group,” a group of scientists who worked on 
the genetics and biochemistry of bacteria 
and bacteriophage in the 1940s, 1950s, 
and 1960s. 

Much of this diversity is apparent only at the molecular level. (See Figure 1.15 for the 
names of some major groups of prokaryotes.) 

Prokaryotes have been found in almost every conceivable environment on Earth, 
from hot sulfur springs to beneath the ocean floor to the insides of larger cells. They ac- 
count for a significant amount of the biomass on Earth. 

Prokaryotes share a number of features in spite of their differences. They lack a nu- 
cleus — their DNA is packed in a region of the cytoplasm called the nucleoid region. 
Many bacterial species have only 1000 genes. From a biochemist’s perspective one of the
most fascinating things about bacteria is that, although their chromosomes contain a 
relatively small number of genes, they carry out most of the fundamental biochemical 
reactions found in all cells, including our own. Hundreds of bacterial genomes have 
been completely sequenced and it is now possible to begin to define the minimum 
number of enzymes that are consistent with life. 

Most bacteria have no internal membrane compartments, although there are many 
exceptions. The plasma membrane is usually surrounded by a cell wall made of a rigid 
network of covalently linked carbohydrate and peptide chains. This cell wall confers the 
characteristic shape of an individual species of bacteria. Despite its mechanical strength, 
the cell wall is porous. In addition to the cell wall most bacteria, including E. coli, possess
an outer membrane consisting of lipids, proteins, and lipids linked to polysaccharides.
The space between the inner plasma membrane and the outer membrane is called
the periplasmic space. It is the major membrane-bound compartment in bacteria and 
plays a crucial role in some important biochemical processes. 

Many bacteria have protein fibers, called pili, on their outer surface. The pili serve 
as attachment sites for cell-cell interactions. Many species have one or more flagella. 
These are long, whip-like structures that can be rotated like the propeller on a boat thus
driving the bacterium through its aqueous environment. 

The small size of prokaryotes provides a high ratio of surface area to volume. Sim- 
ple diffusion is therefore an adequate means for distributing nutrients throughout the 
cytoplasm. One of the prominent macromolecular structures in the cytoplasm is the ri- 
bosome — a large RNA-protein complex required for protein synthesis. All living cells 
have ribosomes but we will see later that bacterial ribosomes differ from eukaryotic ri- 
bosomes in significant details. 

1.8 Eukaryotic Cells: Structural Features 

Eukaryotes include plants, animals, fungi, and protists. Protists are mostly small, single- 
celled organisms that don’t fit into one of the other classes. Along with bacteria these 
four groups make up the five kingdoms of life according to one popular classification 
scheme. (Older schemes retain the four eukaryotic kingdoms but divide the bacteria 
into Eubacteria and Archaea.) 

As members of the animal kingdom we are mostly aware of other animals. As rela- 
tively large organisms we tend to focus on the large scale. Hence, we know about plants 
and mushrooms but not microscopic species. 


◄ Figure 1.17 

The eukaryotic tree of life. The traditional 
Plantae, Animalia, and Fungi kingdoms are 
branches within the much larger “kingdom” 
of Protists. 

The latest trees of eukaryotes help us understand the diversity of the protist king- 
dom. As shown in Figure 1.17, the animal, plant, and fungal “kingdoms” occupy rela- 
tively small branches on the eukaryotic tree of life. 

Eukaryotic cells are surrounded by a single plasma membrane unlike bacteria, 
which usually have a double membrane. The most obvious feature that distinguishes 
eukaryotes from prokaryotes is the presence of a membrane-bound nucleus in eukaryotes.
In fact, eukaryotes are defined by the presence of a nucleus (from the Greek eu-, “true,”
and karuon, “nut” or “kernel”).

As mentioned earlier, eukaryotic cells are almost always larger than bacterial cells, 
commonly 1000-fold greater in volume. Because of their large size, complex internal
structures and mechanisms are required for rapid transport and communication both 
inside the cell and to and from the external medium. A mesh of protein fibers called the 
cytoskeleton extends throughout the cell contributing to cell shape and to the manage- 
ment of intracellular traffic. 

Almost all eukaryotic cells contain additional internal membrane-bound compart- 
ments called organelles. The specific functions of organelles are often closely tied to their 
physical properties and structures. Nevertheless, a significant number of specific biochemi- 
cal processes occur in the cytosol and the cytosol, like organelles, is highly organized. 

The interior of a eukaryotic cell contains an intracellular membrane network. In- 
dependent organelles, including the nucleus, mitochondria, and chloroplasts, are em- 
bedded in this membrane system that pervades the entire cell. Materials flow within 
paths defined by membrane walls and tubules. The intracellular traffic of materials be- 
tween compartments is rapid, highly selective, and closely regulated. 

Figure 1.18 on the next page shows typical animal and plant cells. Both types have a 
nucleus, mitochondria, and a cytoskeleton. Plant cells also contain chloroplasts and vac- 
uoles and are often surrounded by a rigid cell wall. Chloroplasts, also found in algae and 
some other protists, are the sites of photosynthesis. Plant cell walls are mostly composed 
of cellulose, one of the polysaccharides described in Section 1.3B. 

Most multicellular eukaryotes contain tissues. Groups of similarly specialized cells 
within tissues are surrounded by an extracellular matrix containing proteins and poly- 
saccharides. The matrix physically supports the tissue and in some cases directs cell 
growth and movement. 


Animals are a relatively small, highly 
specialized, branch on the tree of life. 

▲ Figure 1.18 

Eukaryotic cells. (a) Composite animal cell. Animal cells are typical eukaryotic cells containing
organelles and structures also found in protists, fungi, and plants. (b) Composite plant cell. Most
plant cells contain chloroplasts, the sites of photosynthesis in plants and algae; vacuoles, large,
fluid-filled organelles containing solutes and cellular wastes; and rigid cell walls composed mostly
of cellulose.

A. The Nucleus 

The nucleus is usually the most obvious structure in a eukaryotic cell. It is structurally de- 
fined by the nuclear envelope, a membrane with two layers that join at protein-lined nu- 
clear pores. The nuclear envelope is connected to the endoplasmic reticulum (see below). 
The nucleus is the control center of the cell, containing 95% of its DNA, which is tightly
packed with positively charged proteins called histones and coiled into a dense mass called 
chromatin. Replication of DNA and transcription of DNA into RNA occur in the nucleus. 
Many eukaryotes have a dense mass in the nucleus called the nucleolus. The nucleolus is a 
major site of RNA synthesis and the site of assembly of ribosomes. 

Most eukaryotes contain far more DNA than do prokaryotes. Whereas the genetic 
material, or genome, of prokaryotes is usually a single circular molecule of DNA, the eu- 
karyotic genome is organized as multiple linear chromosomes. In eukaryotes new DNA 
and histones are synthesized in preparation for cell division and the chromosomal mate- 
rial condenses and separates into two identical sets of chromosomes. This process is 
called mitosis (Figure 1.19). The cell is then pinched in two to complete cell division. 

Most eukaryotes are diploid — they contain two complete sets of chromosomes. 
From time to time eukaryotic cells undergo meiosis resulting in the production of four 
haploid cells each with a single set of chromosomes. Two haploid cells — eggs and 
sperm, for example — can then fuse to regenerate a typical diploid cell. This process is 
one of the key features of sexual reproduction in eukaryotes. 

B. The Endoplasmic Reticulum and Golgi Apparatus 

A network of membrane sheets and tubules called the endoplasmic reticulum (ER) ex- 
tends from the outer membrane of the nucleus. The aqueous region enclosed within the 
endoplasmic reticulum is called the lumen. In many cells part of the surface of the 
endoplasmic reticulum is coated with ribosomes that are actively synthesizing proteins. 

◄ Figure 1.19 

Mitosis. The five stages of mitosis are shown. Chromosomes (red) condense and line up in the center 
of the cell. Spindle fibers (green) are responsible for separating the recently duplicated chromosomes. 











As synthesis continues the protein is translocated through the membrane into the 
lumen. Proteins destined for export from the cell are completely extruded through the 
membrane into the lumen where they are packaged in membranous vesicles. These 
vesicles travel through the cell and fuse with the plasma membrane releasing their con- 
tents into the extracellular space. The synthesis of proteins destined to remain in the cy- 
tosol occurs at ribosomes that are not bound to the endoplasmic reticulum. 

A complex of flattened, fluid-filled, membranous sacs called the Golgi apparatus is
often found close to the endoplasmic reticulum and the nucleus. Vesicles that bud off 
from the endoplasmic reticulum fuse with the Golgi apparatus. The proteins carried by 
the vesicles may be chemically modified as they pass through the layers of the Golgi ap- 
paratus. The modified proteins are then sorted, packaged in new vesicles, and trans- 
ported to specific destinations inside or outside the cell. The Golgi apparatus was discov- 
ered by Camillo Golgi in the 19th century (Nobel Laureate, 1906), although it wasn’t 
until many decades later that its role in protein secretion was established. 

C. Mitochondria and Chloroplasts 

Mitochondria and chloroplasts have central roles in energy transduction. Mitochondria 
are the main sites of oxidative energy metabolism. They are found in almost all eukary- 
otic cells. Chloroplasts are the sites of photosynthesis in plants and algae. 

The mitochondrion has an inner and an outer membrane. The inner membrane is 
highly folded, resulting in a surface area three to five times that of the outer membrane. 
It is impermeable to ions and most metabolites. The aqueous phase enclosed by the 
inner membrane is called the mitochondrial matrix. Many of the enzymes involved in 
aerobic energy metabolism are found in the inner membrane and the matrix. 

Mitochondria come in many sizes and shapes. The standard jellybean-shaped mi- 
tochondrion shown here is found in many cell types but some mitochondria are spher- 
ical or have irregular shapes. 

The most important role of the mitochondrion is to oxidize organic acids, fatty 
acids, and amino acids to carbon dioxide and water. Much of the released energy is con- 
served in the form of a proton concentration gradient across the inner mitochondrial 
membrane. This stored energy is used to drive the conversion of adenosine diphosphate 
(ADP) and inorganic phosphate (Pᵢ) to the energy-rich molecule ATP in a phosphorylation
process that will be described in detail in Chapter 14. ATP is then used by the cell
for such energy-requiring processes as biosynthesis, transport of certain molecules and
ions against concentration and charge gradients, and generation of mechanical force for 
such purposes as locomotion and muscle contraction. The number of mitochondria 
found in cells varies widely. Some eukaryotic cells contain only a few mitochondria 
whereas others have thousands. 


◄ Nuclear envelope and endoplasmic reticu- 
lum (ER) of a eukaryotic cell. 

Protein synthesis, sorting, and 
secretion are described in Chapter 22. 

▲ Golgi apparatus. The Golgi apparatus is re- 
sponsible for the modification and sorting of 
proteins that have been transported to the 
Golgi apparatus by vesicles from the ER. 
Vesicles budding off the Golgi apparatus 
carry modified material to destinations in- 
side and outside the cell. 



▲ Mitochondrion. Mitochondria are the main 
sites of energy transduction in aerobic eu- 
karyotic cells. Carbohydrates, fatty acids, 
and amino acids are metabolized in this 


► Chloroplast. Chloroplasts are the sites of 
photosynthesis in plants and algae. Light 
energy is captured by pigments associated 
with the thylakoid membrane and used to 
convert carbon dioxide and water to carbo- 




▲ Micrographs of fluorescently labeled actin 
filaments and microtubules in mammalian 
cells. (Left) Actin filaments in rat muscle 
cells. (Right) Microtubules in human en- 
dothelial cells. 

Photosynthetic plant cells contain chloroplasts as well as mitochondria. Like mito- 
chondria, chloroplasts have an outer membrane and a complex, highly folded, inner 
membrane called the thylakoid membrane. Part of the inner membrane forms flattened 
sacs called grana (singular, granum). The thylakoid membrane, which is suspended in 
the aqueous stroma, contains chlorophyll and other pigments involved in the capture of 
light energy. Ribosomes and several circular DNA molecules are also suspended in the 
stroma. In chloroplasts the energy captured from light is used to drive the formation of 
carbohydrates from carbon dioxide and water. 

Mitochondria and chloroplasts are derived from bacteria that entered into internal 
symbiotic relationships with primitive eukaryotic cells more than 1 billion years ago. 
Evidence for the endosymbiotic (endo-, “within”) origin of mitochondria and chloroplasts
includes the presence within these organelles of separate, small genomes and specific
ribosomes that resemble those of bacteria. In recent years scientists have compared
the sequences of mitochondrial and chloroplast genes (and proteins) with those of 
many species of bacteria. These studies in molecular evolution have shown that mito- 
chondria are derived from primitive members of a particular group of bacteria called 
proteobacteria. Chloroplasts are descended from a distantly related class of photosyn- 
thetic bacteria called cyanobacteria. 

D. Specialized Vesicles 

Eukaryotic cells contain specialized digestive vesicles called lysosomes. These vesicles 
are surrounded by a single membrane that encloses a highly acidic interior. The acidity 
is maintained by proton pumps embedded in the membrane. Lysosomes contain a vari- 
ety of enzymes that catalyze the breakdown of cellular macromolecules such as proteins 
and nucleic acids. They can also digest large particles such as retired mitochondria and 
bacteria ingested by the cell. Lysosomal enzymes are much less active at the near-neutral
pH of the cytosol than they are under the acidic conditions inside the lysosome. The 
compartmentalization of lysosomal enzymes keeps them from accidentally catalyzing 
the degradation of macromolecules in the cytosol. 

Peroxisomes are present in all animal cells and many plant cells. Like lysosomes, 
they are surrounded by a single membrane. Peroxisomes carry out oxidation reactions, 
some of which produce the toxic compound hydrogen peroxide (H₂O₂). Some hydrogen
peroxide is used for the oxidation of other compounds. Excess hydrogen peroxide is
destroyed by the action of the peroxisomal enzyme catalase, which catalyzes the conver- 
sion of hydrogen peroxide to water and oxygen. 

Vacuoles are fluid-filled vesicles surrounded by a single membrane. They are com- 
mon in mature plant cells and some protists. These vesicles are storage sites for water, 
ions, and nutrients such as glucose. Some vacuoles contain metabolic waste products 
and some contain enzymes that can catalyze the degradation of macromolecules no 
longer needed by the plant. 


E. The Cytoskeleton 

The cytoskeleton is a protein scaffold required for support, internal organization, and even 
movement of the cell. Some types of animal cells contain a dense cytoskeleton but it is 
much less prominent in most other eukaryotic cells. The cytoskeleton consists of three 
types of protein filaments: actin filaments, microtubules, and intermediate filaments. All 
three types are built of individual protein molecules that combine to form threadlike fibers. 

Actin filaments (also called microfilaments) are the most abundant cytoskeletal
component. They are composed of a protein called actin that forms ropelike threads 
with a diameter of about 7 nm. Actin has been found in all eukaryotic cells and is fre- 
quently the most abundant protein in the cell. It is also one of the most evolutionarily 
conserved proteins. This is evidence that actin filaments were present in the ancestral 
eukaryotic cell from which all modern eukaryotes are descended. 

Microtubules are strong, rigid fibers frequently packed in bundles. They have a di- 
ameter of about 22 nm — much thicker than actin filaments. Microtubules are com- 
posed of a protein called tubulin. Microtubules serve as a kind of internal skeleton in 
the cytoplasm, but they also form the mitotic spindle during mitosis. In addition, mi- 
crotubules can form structures capable of directed movement, such as cilia. The flagella 
that propel sperm cells are an example of very long cilia — they are not related to bacter- 
ial flagella. The waving motion of cilia is driven by energy from ATP. 

Intermediate filaments are found in the cytoplasm of most eukaryotic cells. These 
filaments have diameters of approximately 10 nm, which makes them intermediate in 
size compared to actin filaments and microtubules. Intermediate filaments line the in- 
side of the nuclear envelope and extend outward from the nucleus to the periphery of 
the cell. They help the cell resist external mechanical stresses. 

1.9 A Picture of the Living Cell 

We have now introduced the major structures found within cells and described their 
roles. These structures are immense compared to the molecules and polymers that will 
be our focus for the rest of this book. Cells contain thousands of different metabolites 
and many millions of molecules. In the cytosol of every cell there are hundreds of dif- 
ferent enzymes, each acting specifically on only one or possibly a few related metabo- 
lites. There may be 100,000 copies of some enzymes per cell but only a few copies of 
other enzymes. Each enzyme is bombarded with potential substrates. 

Molecular biologist and artist David S. Goodsell has produced captivating images 
showing the molecular contents of an E. coli cell magnified 1 million times (Figure 1.20 on 
page 26). Approximately 600 cubes of this size represent the volume of the E. coli cell. At 
this scale individual atoms are smaller than the dot in the letter i and small metabolites 
are barely visible. Proteins are the size of a grain of rice. 
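
The scale of such a drawing can be worked out directly. The following sketch assumes that each cube has a 100 nm edge (matching the 100 × 100 nm window given in the caption of Figure 1.20); the atom and protein sizes are rough illustrative values, not numbers from the text.

```python
MAGNIFICATION = 1_000_000   # from the text: contents magnified 1 million times
NM_PER_MM = 1_000_000       # 1 mm = 1,000,000 nm

def drawn_size_mm(actual_nm: float) -> float:
    """Size on the page (in mm) of an object whose real size is actual_nm."""
    return actual_nm * MAGNIFICATION / NM_PER_MM

# Assumed, illustrative real sizes in nanometres.
sizes_nm = {"atom": 0.1, "typical protein": 5.0, "cube edge": 100.0}
for name, nm in sizes_nm.items():
    print(f"{name}: {drawn_size_mm(nm):g} mm on the page")

# Volume of 600 cubes with a 100 nm (0.1 µm) edge, in cubic micrometres.
cell_volume_um3 = 600 * 0.1 ** 3
print(f"approximate cell volume: {cell_volume_um3:.2f} cubic micrometres")
```

With these assumed sizes a protein is drawn a few millimetres across, consistent with the grain-of-rice comparison, and 600 such cubes correspond to a cell volume of well under one cubic micrometre.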

A drawing of the molecules in a cell shows how densely packed the cytoplasm can be, 
but it cannot give a sense of activity at the atomic scale. All the molecules in a cell are moving 
and colliding with each other. The collisions between molecules are fully elastic — the energy 
of a collision is conserved in the energy of the rebound. As molecules bounce off each other 
they travel a wildly crooked path in space, called the random walk of diffusion. For a small 
molecule such as water, the mean distance traveled between collisions is less than the dimen- 
sions of the molecule and the path includes many reversals of direction. Despite its convo- 
luted path, a water molecule can diffuse the length of an E. coli cell in 1/10 second. 
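
The random walk described above can be illustrated with a small simulation. This is a deliberately simplified sketch: it assumes a one-dimensional walk with unit steps of equal probability in either direction, and the step and walker counts are arbitrary choices for the example.

```python
import random

def mean_squared_displacement(n_steps: int, n_walkers: int, seed: int = 0) -> float:
    """Average squared end-to-end distance of 1-D random walks made of unit steps."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    total = 0
    for _ in range(n_walkers):
        position = 0
        for _ in range(n_steps):
            position += rng.choice((-1, 1))  # step left or right with equal probability
        total += position ** 2
    return total / n_walkers

for n in (100, 400):
    msd = mean_squared_displacement(n, n_walkers=2000)
    print(f"{n} steps: mean squared displacement is about {msd:.0f}")
```

In this model the mean squared displacement grows in proportion to the number of steps, so the typical distance covered grows only as the square root of time. Diffusion is therefore fast over the dimensions of a bacterial cell but becomes slow over much longer distances.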

An enzyme and a small molecule will collide 1 million times per second. Under 
these conditions, a rate of catalysis typical of many enzymes could be achieved even if 
only 1 in about 1000 collisions results in a reaction. Nevertheless, some enzymes cat- 
alyze reactions with an efficiency far greater than 1 reaction per 1000 collisions. In fact, 
a few enzymes catalyze reactions with almost every molecule of substrate their active 
sites encounter, an example of the astounding potency of enzyme-directed chemistry.
The study of the reaction rates of enzymes, or enzyme kinetics, is one of the most fun- 
damental aspects of biochemistry. It will be covered in Chapter 6. 
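
The arithmetic behind this estimate is simple enough to write out. The sketch below uses the collision frequency quoted in the text; the function name and the idea of a "productive fraction" are illustrative conveniences, not terminology from the book.

```python
COLLISIONS_PER_SECOND = 1_000_000   # from the text: enzyme and small molecule

def reactions_per_second(productive_fraction: float) -> float:
    """Catalytic events per second if this fraction of collisions leads to reaction."""
    return COLLISIONS_PER_SECOND * productive_fraction

typical = reactions_per_second(1 / 1000)   # 1 in 1000 collisions is productive
ceiling = reactions_per_second(1.0)        # every collision is productive

print(f"1 in 1000 productive: {typical:.0f} reactions per second")
print(f"every collision productive: {ceiling:.0f} reactions per second")
```

Under these figures, one productive collision in a thousand still yields on the order of a thousand reactions per second, and the collision frequency itself sets the upper limit.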

Lipids in membranes also diffuse vigorously, though only within the two-dimensional
plane of the lipid bilayer. Lipid molecules exchange places with neighboring

▲ Actin. Actin filament showing the organi- 
zation in individual subunits of the protein 
actin. (Courtesy David S. Goodsell) 



▲ Figure 1.20 

Portion of the cytosol of an E. coli cell. The top illustration, in which the contents are
magnified 1 million times, represents a window 100 × 100 nm. Proteins are in shades
of blue and green. Nucleic acids are in shades of pink. The large structures are ribosomes.
Water and small metabolites are not shown. The contents in the round inset
are magnified 10 million times, showing water and other small molecules.

molecules in membranes about 6 million times per second. Some membrane proteins 
can also diffuse rapidly within the membrane. 

Large molecules diffuse more slowly than small ones. In eukaryotic cells the diffu- 
sion of large molecules such as enzymes is retarded even further by the complex net- 
work of the cytoskeleton. Large molecules diffuse across a given distance as much as 
10 times more slowly in the cytosol than in pure water. 

The full extent of cytosolic organization is not yet known. A number of proteins 
and enzymes form large complexes that carry out a series of reactions. We will en- 
counter several such complexes in our study of metabolism. They are often referred to 
as protein machines. This arrangement has the advantage that metabolites pass directly 
from one enzyme to the next without diffusing away into the cytosol. Many researchers 
are sympathetic to the idea that the cytosol is not merely a random mixture of soluble 
molecules but is highly organized in contrast to the long-held impression that simple 
solution chemistry governs cytosolic activity. The concept of a highly organized cytosol 
is a relatively new idea in biochemistry. It may lead to important new insights about 
how cells work at the molecular level. 

1.10 Biochemistry Is Multidisciplinary 

One of the goals of biochemists is to integrate a large body of knowledge into a molecu- 
lar explanation of life. This has been, and continues to be, a challenging task but, in spite 
of the challenges, biochemists have made a great deal of progress toward defining and 
understanding the basic reactions common to all cells. 

The discipline of biochemistry does not exist in a vacuum. We have already seen 
how physics, chemistry, cell biology, and evolution contribute to an understanding of 
biochemistry. Related disciplines, such as physiology and genetics, are also important. 
In fact, many scientists no longer consider themselves to be just biochemists but are also 
knowledgeable in several related fields. 

Because all aspects of biochemistry are interrelated it is difficult to present one 
topic without referring to others. For example, function is intimately related to struc- 
ture and the regulation of individual enzyme activities can be appreciated only in the 
context of a series of linked reactions. The interrelationship of biochemistry topics is a 
problem for both students and teachers in an introductory biochemistry course. The 
material must be presented in a logical and sequential manner but there is no universal 
sequence of topics that suits every course, or every student. Fortunately, there is general 
agreement on the broad outline of an approach to understanding the basic principles of 
biochemistry and this textbook follows that outline. We begin with an introductory 
chapter on water. We will then describe the structures and functions of proteins and en- 
zymes, carbohydrates, and lipids. The third part of the book makes use of structural in- 
formation to describe metabolism and its regulation. Finally, we will examine nucleic 
acids and the storage and transmission of biological information. 

Some courses may cover the material in a slightly different order. For example, the 
structures of nucleic acids can be described before the metabolism section. Wherever 
possible, we have tried to write chapters so that they can be covered in different orders 
in a course depending on the particular needs and interests of the students. 

Appendix The Special Terminology of Biochemistry 

Most biochemical quantities are specified using Systeme International (SI) units. Some 
common SI units are listed in Table 1.1. Many biochemists still use more traditional
units, although these are rapidly disappearing from the scientific literature. For example,
protein chemists sometimes use the angstrom (Å) to report interatomic distances;
1 Å is equal to 0.1 nm, the preferred SI unit. Calories (cal) are sometimes used instead of
joules (J); 1 cal is equal to 4.184 J. 
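
These two conversions can be captured in a pair of helper functions. The conversion factors come from the text; the function names and the sample values (a 1.5 Å distance and 1000 cal) are hypothetical, chosen only for illustration.

```python
NM_PER_ANGSTROM = 0.1        # 1 Å = 0.1 nm
JOULES_PER_CALORIE = 4.184   # 1 cal = 4.184 J

def angstrom_to_nm(angstroms: float) -> float:
    """Convert a length in angstroms to nanometres."""
    return angstroms * NM_PER_ANGSTROM

def cal_to_joule(calories: float) -> float:
    """Convert an energy in calories to joules."""
    return calories * JOULES_PER_CALORIE

distance_nm = angstrom_to_nm(1.5)   # hypothetical interatomic distance
energy_j = cal_to_joule(1000)       # 1000 cal, i.e. 1 kcal

print(f"1.5 Å = {distance_nm:.2f} nm")
print(f"1000 cal = {energy_j:.0f} J")
```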

TABLE 1.1 SI units commonly used in

[Table body not recovered; surviving entries: liter (a), electric potential, Kelvin (b).]

a 1 liter = 1000 cubic centimeters.
b 273 K = 0°C.

Table 1.2 Prefixes commonly used with SI units

Prefix   Symbol   Multiplication factor
giga     G        10⁹
mega     M        10⁶
kilo     k        10³
deci     d        10⁻¹
centi    c        10⁻²
milli    m        10⁻³
micro    µ        10⁻⁶
nano     n        10⁻⁹
pico     p        10⁻¹²
femto    f        10⁻¹⁵

The standard SI unit of temperature is the Kelvin, but temperature is most commonly
reported in degrees Celsius (°C). One degree Celsius is equal in magnitude to
1 Kelvin, but the Celsius scale begins at the freezing point of water (0°C) and 100°C is
the boiling point of water at 1 atm. This scale is often referred to as the centigrade scale
(centi- = 1/100). Absolute zero is −273°C, which is equal to 0 K. In warm-blooded
mammals biochemical reactions occur at body temperature (37°C in humans).

Very large or very small numerical values for some SI units can be indicated by 
an appropriate prefix. The commonly used prefixes and their symbols are listed in 
Table 1.2. In addition to the standard SI units employed in all fields, biochemistry has 
its own special terminology; for example, biochemists use convenient abbreviations for 
biochemicals that have long names. 
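
Converting between prefixed units amounts to shifting a power of ten. The sketch below uses the standard SI prefix symbols and their exponents; the dictionary, the function name, and the sample values are illustrative choices rather than anything defined in the text.

```python
# Exponents of ten for common SI prefixes ("" means no prefix).
SI_PREFIX_EXPONENTS = {
    "G": 9, "M": 6, "k": 3, "": 0, "d": -1, "c": -2,
    "m": -3, "µ": -6, "n": -9, "p": -12, "f": -15,
}

def convert_prefix(value: float, from_prefix: str, to_prefix: str) -> float:
    """Re-express a value given with one SI prefix using another prefix."""
    shift = SI_PREFIX_EXPONENTS[from_prefix] - SI_PREFIX_EXPONENTS[to_prefix]
    return value * 10.0 ** shift

length_um = convert_prefix(2000, "n", "µ")   # 2000 nm expressed in µm
mass_mg = convert_prefix(0.5, "", "m")       # 0.5 g expressed in mg

print(f"2000 nm = {length_um:g} µm")
print(f"0.5 g = {mass_mg:g} mg")
```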

The terms RNA and DNA are good examples. They are shorthand versions of the 
long names ribonucleic acid and deoxyribonucleic acid. Abbreviations such as these are 
very convenient, and learning to associate them with their corresponding chemical 
structures is a necessary step in mastering biochemistry. In this book, we will describe 
common abbreviations as each new class of compounds is introduced. 

Selected Readings 


Bruice, P. Y. (2011). Organic Chemistry, 6th ed.
(Upper Saddle River, NJ: Prentice Hall). 

Tinoco, I., Sauer, K., Wang, J. C., and Puglisi, J. D. 
(2002). Physical Chemistry: Principles and Applica- 
tions in Biological Sciences , 4th ed. (Upper Saddle 
River, NJ: Prentice Hall). 

van Holde, K. E., Johnson, W. C., and Ho, P. S.
(2005). Principles of Physical Biochemistry, 2nd ed.
(Upper Saddle River, NJ: Prentice Hall).


Alberts, B., Bray, D., Hopkin, K., Johnson, A., Lewis, 
J., Raff, M., Roberts, K., and Walter, P. (2004). 
Essential Cell Biology (New York: Garland). 

Lodish, H., Berk, A., Matsudaira, P., Kaiser,
C. A., Kreiger, M., Scott, M. P., Zipursky, L.,
and Darnell, J. (2003). Molecular Cell
Biology, 5th ed. (New York: Scientific
American Books).

Goodsell, D. S. (1993). The Machinery of Life (New
York: Springer-Verlag).

Evolution and the Diversity of Life 

Doolittle, W. F. (2000). Uprooting the tree of life. 
Sci. Am. 282(2):90-95. 

Doolittle, W. F. (2009). Eradicating topological 
thinking in prokaryotic systematics and evolution. 
Cold Spr. Hbr. Symp. Quant. Biol. 

Margulis, L., and Schwartz, K. V. (1998).
Five Kingdoms, 3rd ed. (New York: W.H.


Graur, D., and Li, W.-H. (2000). Fundamentals 
of Molecular Evolution (Sunderland, MA: 


Sapp, J. (Ed.) (2005). Microbial Phylogeny and Evo- 
lution: Concepts and Controversies. (Oxford, UK: 
Oxford University Press). 

Sapp, J. (2009) The New Foundations of Evolution. 
(Oxford, UK: Oxford University Press). 

History of Science 

Kohler, R. E. (1975). The History of Biochemistry,
a Survey. J. Hist. Biol. 8:275-318.



Life on Earth is often described as a carbon-based phenomenon but it would be
equally correct to refer to it as a water-based phenomenon. Life probably orig- 
inated in water more than three billion years ago and all living cells still de- 
pend on water for their existence. Water is the most abundant molecule in most cells 
accounting for 60% to 90% of the mass of the cell. The exceptions are cells from which 
water is expelled such as those in seeds and spores. Seeds and spores can lie dormant 
for long periods of time until they are revived by the reintroduction of water. 

Life spread from the oceans to the continents about 500 million years ago. This 
major transition in the history of life required special adaptations to enable terrestrial 
life to survive in an environment where water was less plentiful. You will encounter 
many of these adaptations in the rest of this book. 

An understanding of water and its properties is important to the study of biochem- 
istry. The macromolecular components of cells — proteins, polysaccharides, nucleic 
acids, and lipids — assume their characteristic shapes in response to water. For example,
some types of molecules interact extensively with water and, as a result, are very soluble 
while other molecules do not dissolve easily in water and tend to associate with each 
other in order to avoid water. Much of the metabolic machinery of cells has to operate 
in an aqueous environment because water is an essential solvent. 

We begin our detailed study of the chemistry of life by examining the properties 
of water. The physical properties of water allow it to act as a solvent for ionic and 
other polar substances, and the chemical properties of water allow it to form weak 
bonds with other compounds, including other water molecules. The chemical proper- 
ties of water are also related to the functions of macromolecules, entire cells, and or- 
ganisms. These interactions are important sources of structural stability in macro- 
molecules and large cellular structures. We will see how water affects the interactions 
of substances that have low solubility in water. We will examine the ionization of 
water and discuss acid-base chemistry — topics that are the foundation for under- 
standing the molecules and processes that we will encounter in subsequent chapters. 
It’s important to keep in mind that water is not just an inert solvent; it is also a sub- 
strate for many cellular reactions. 

There is nothing softer and weaker 
than water, And yet there is nothing 
better for attacking hard and strong 
things. For this reason there is no 
substitute for it. 

—Lao-Tzu (c. 550 BCE) 

▲ Eureka Dunes evening primrose (Oenothera
californica). This species only grows in the
sand dunes of Death Valley National Park in 
California. It has evolved special mecha- 
nisms for conserving water. 

Top: Earth from space. The earth is a watery planet and water plays a central role in the chemistry of all life. 


2.1 The Water Molecule Is Polar 



A water molecule (H₂O) is V-shaped (Figure 2.1a) and the angle between the two covalent
(O–H) bonds is 104.5°. Some important properties of water arise from its
angled shape and the intermolecular bonds that it can form. An oxygen atom has
eight electrons and its nucleus has eight protons and eight neutrons. There are two
electrons in the inner shell and six electrons in the outer shell. The outer shell can
potentially accommodate four pairs of electrons in one s orbital and three p orbitals.
However, the structure of water and its properties can be better explained by assuming
that the electrons in the outer shell occupy four sp³ hybrid orbitals. Think of
these four orbitals as occupying the four corners of a tetrahedron that surrounds the
central atom of oxygen. Two of the sp³ hybrid orbitals each contain a pair of electrons and
the other two each contain a single electron. This means that oxygen can form covalent
bonds with other atoms by sharing electrons to fill these single-electron orbitals.
In water the covalent bonds involve two different hydrogen atoms, each of which
shares its single electron with the oxygen atom. In Figure 2.1b each electron is indicated
by a blue dot showing that each sp³ hybrid orbital of the oxygen atom is occupied
by two electrons including those shared with the hydrogen atoms. The inner
shell of the hydrogen atom is also filled because of these two shared electrons in the
covalent bond.

The H–O–H bond angle in free water molecules is 104.5° but if the electron orbitals
were really pointing to the four corners of a tetrahedron, the angle would be
109.5°. The usual explanation for this difference is that there is strong repulsion between
the lone electron pairs and this repulsion pushes the covalent bond orbitals closer
together, reducing the angle from 109.5° to 104.5°.
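
The 109.5° figure follows from the geometry of a regular tetrahedron. As a quick check, the sketch below places the four corners at alternate corners of a cube (a standard construction, assumed here for convenience) and measures the angle between any two centre-to-corner directions.

```python
import math
from itertools import combinations

# Vectors from the centre of a regular tetrahedron to its four corners,
# taken as alternate corners of a cube centred on the origin.
CORNERS = [(1, 1, 1), (1, -1, -1), (-1, 1, -1), (-1, -1, 1)]

def angle_deg(u, v) -> float:
    """Angle in degrees between two 3-D vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return math.degrees(math.acos(dot / (norm_u * norm_v)))

angles = [angle_deg(u, v) for u, v in combinations(CORNERS, 2)]
tetrahedral = angles[0]

print(f"ideal tetrahedral angle: {tetrahedral:.2f} degrees")
print(f"compression in water: {tetrahedral - 104.5:.2f} degrees")
```

All six pairs of directions give the same angle, arccos(−1/3), so the observed 104.5° in water is about five degrees smaller than the ideal tetrahedral value.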

Oxygen atoms are more electronegative than hydrogen atoms because an oxygen
nucleus attracts electrons more strongly than the single proton in the hydrogen nucleus.
As a result, an uneven distribution of charge occurs within each O–H bond of the
water molecule with oxygen bearing a partial negative charge (δ⊖) and hydrogen bearing
a partial positive charge (δ⊕). This uneven distribution of charge within a bond is
known as a dipole and the bond is said to be polar.

The polarity of a molecule depends both on the polarity of its covalent bonds and 
its geometry. The angled arrangement of the polar O–H bonds of water creates a permanent
dipole for the molecule as a whole as shown in Figure 2.2a. A molecule of ammonia
also contains a permanent dipole (Figure 2.2b). Thus, even though water and
gaseous ammonia are electrically neutral, both molecules are polar. The high solubility 
of the polar ammonia molecules in water is facilitated by strong interactions with the 
polar water molecules. The solubility of ammonia in water demonstrates the principle 
that “like dissolves like.” 

Not all molecules are polar; for example, carbon dioxide also contains polar cova- 
lent bonds but the bonds are aligned with each other and oppositely oriented so the po- 
larities cancel each other (Figure 2.2c). As a result, carbon dioxide has no net dipole and 
is much less soluble in water than ammonia. 
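
The way geometry decides whether bond polarities cancel can be shown by adding the bond dipoles as vectors. This is a rough sketch under simplifying assumptions: each molecule has two identical bond dipoles of arbitrary magnitude 1.0, and contributions from lone pairs are ignored.

```python
import math

def net_dipole(bond_dipole: float, bond_angle_deg: float) -> float:
    """Magnitude of the vector sum of two equal bond dipoles separated by bond_angle_deg."""
    half_angle = math.radians(bond_angle_deg) / 2
    # Components along the bisector add; components across it cancel.
    return 2 * bond_dipole * math.cos(half_angle)

water = net_dipole(1.0, 104.5)            # bent molecule
carbon_dioxide = net_dipole(1.0, 180.0)   # linear molecule

print(f"water (104.5 degrees): net dipole = {water:.2f} bond-dipole units")
print(f"carbon dioxide (180 degrees): net dipole = {carbon_dioxide:.2f} bond-dipole units")
```

With a bond angle of 104.5° the two bond dipoles reinforce each other along the bisector, whereas at 180° they point in opposite directions and the sum vanishes.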




▲ Figure 2.1 A water molecule. (a) Space-filling
structure of a water molecule.
(b) Angle between the covalent bonds of a
water molecule. Two of the sp³ hybrid
orbitals of the oxygen atom participate in
covalent bonds with s orbitals of hydrogen
atoms. The other two sp³ orbitals are
occupied by lone pairs of electrons.


Polar molecules are molecules with an 
unequal distribution of charge so that one 
end of the molecule is more negative
and another end is more positive. 

◄ Figure 2.2 

Polarity of small molecules. (a) The geometry
of the polar covalent bonds of water creates
a permanent dipole for the molecule with
the oxygen bearing a partial negative charge
(symbolized by 2δ−) and each hydrogen
bearing a partial positive charge (symbolized
by δ+). (b) The pyramidal shape of a molecule
of ammonia also creates a permanent
dipole. (c) The polarities of the collinear
bonds in carbon dioxide cancel each other.
Therefore, CO₂ is not polar. (Arrows depicting
dipoles point toward the negative charge
with a cross at the positive end.)

30 CHAPTER 2 Water 


Hydrogen bonds form when a hydrogen
atom with a partially positive charge (δ+)
is shared between two electronegative
atoms (2δ−). Hydrogen bonds are much
weaker than covalent bonds.

2.2 Hydrogen Bonding in Water 

One of the important consequences of the polarity of the water molecule is that water 
molecules attract one another. The attraction between one of the slightly positive hy- 
drogen atoms of one water molecule and the slightly negative electron pairs in one of 
the sp³ hybrid orbitals produces a hydrogen bond (Figure 2.3). In a hydrogen bond
between two water molecules the hydrogen atom remains covalently bonded to its oxy- 
gen atom, the hydrogen donor. At the same time, it is attracted to another oxygen atom, 
called the hydrogen acceptor. In effect, the hydrogen atom is being shared (unequally) 
between the two oxygen atoms. The distance from the hydrogen atom to the acceptor 
oxygen atom is about twice the length of the covalent bond. 

Water is not the only molecule capable of forming hydrogen bonds; these interac- 
tions can occur between any electronegative atom and a hydrogen atom attached to an- 
other electronegative atom. (We will examine other examples of hydrogen bonding in 
Section 2.5B.) Hydrogen bonds are much weaker than typical covalent bonds. The 
strength of hydrogen bonds in water and in solutions is difficult to measure directly but 
it is estimated to be about 20 kJ mol⁻¹.

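$$\mathrm{H{-}O{-}H} + \mathrm{H{-}O{-}H} \;\rightleftharpoons\; \mathrm{H{-}O{-}H \cdots OH_2} \qquad \Delta H_f = -20\ \mathrm{kJ\,mol^{-1}} \tag{2.1}$$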

About 20 kJ mol⁻¹ of heat is given off when hydrogen-bonded water molecules
form in water under standard conditions. (Recall that standard conditions are 1 atm
pressure and a temperature of 25°C.) This value is the standard enthalpy of formation
(ΔH_f). It means that the change in enthalpy when hydrogen bonds form is about −20 kJ
per mole of water. This is equivalent to saying that +20 kJ mol⁻¹ of heat energy is required
to disrupt hydrogen bonds between water molecules, the reverse of the reaction
shown in Reaction 2.1. This value depends on the type of hydrogen bond. In contrast,
the energy required to break a covalent O — H bond in water is about 460 kJ mol⁻¹, and
the energy required to break a covalent C — H bond is about 410 kJ mol⁻¹. Thus, the
strength of hydrogen bonds is less than 5% of the strength of typical covalent bonds.
Hydrogen bonds are weak interactions compared to covalent bonds. 
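
The "less than 5%" comparison follows directly from the three energies quoted above. A minimal check, using only those values (the variable names are ours):

```python
# Bond energies quoted in the text, in kJ/mol
hydrogen_bond = 20.0
covalent_bonds = {"O-H": 460.0, "C-H": 410.0}

# Hydrogen-bond strength as a percentage of each covalent bond energy
ratios = {bond: 100.0 * hydrogen_bond / energy
          for bond, energy in covalent_bonds.items()}

for bond, percent in ratios.items():
    print(f"A hydrogen bond is {percent:.1f}% as strong as a covalent {bond} bond")
```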

Orientation is important in hydrogen bonding. A hydrogen bond is most stable when 
the hydrogen atom and the two electronegative atoms associated with it (the two oxygen 
atoms, in the case of water) are aligned, or nearly in line, as shown in Figure 2.3. Water 
molecules are unusual because they can form four O — H — O aligned hydrogen bonds 
with up to four other water molecules (Figure 2.4). They can donate each of their two hy- 
drogen atoms to two other water molecules and accept two hydrogen atoms from two 
other water molecules. Each hydrogen atom can participate in only one hydrogen bond. 

The three-dimensional interactions of liquid water are difficult to study but much 
has been learned by examining the structure of ice crystals (Figure 2.5). In the common 
form of ice, every molecule of water participates in four hydrogen bonds, as expected. 
Each of the hydrogen bonds points to the oxygen atom of an adjacent water molecule 
and these four adjacent hydrogen-bonded oxygen atoms occupy the vertices of a tetra- 
hedron. This arrangement is consistent with the structure of water shown in Figure 2.1 

Figure 2.3 ► 

Hydrogen bonding between two water molecules.
A partially positive (δ+) hydrogen
atom of one water molecule attracts the
partially negative (2δ−) oxygen atom of a
second water molecule, forming a hydrogen
bond. The distances between atoms of two 
water molecules in ice are shown. Hydrogen 
bonds are indicated by dashed lines high- 
lighted in yellow, as shown here and 
throughout the book. 

0.28 nm 



except that the bond angles are all equal (109.5°). This is because the polarity of individual 
water molecules, which distorts the bond angles, is canceled by the presence of hydrogen 
bonds. The average energy required to break each hydrogen bond in ice has been estimated
to be 23 kJ mol⁻¹, making those bonds a bit stronger than those formed in water.

The ability of water molecules in ice to form four hydrogen bonds and the strength 
of these hydrogen bonds give ice an unusually high melting point because a large 
amount of energy, in the form of heat, is required to disrupt the hydrogen-bonded lat- 
tice of ice. When ice melts most of the hydrogen bonds are retained by liquid water. 
Each molecule of liquid water can form up to four hydrogen bonds with its neighbors 
but most participate in only two or three at any given moment. This means that the 
structure of liquid water is less ordered than that of ice. The fluidity of liquid water is 
primarily a consequence of the constantly fluctuating pattern of hydrogen bonding as 
hydrogen bonds break and re-form. At any given time there will be many water mole- 
cules participating in two, three, or four hydrogen bonds with other water molecules. 
There will also be many that participate in only one hydrogen bond or none at all. This 
is a dynamic structure: the average hydrogen bond lifetime in water is only 10 picoseconds
(10⁻¹¹ s).

The density of most substances increases upon freezing as molecular motion slows 
and tightly packed crystals form. The density of water also increases as it cools — until it 
reaches a maximum of 1.000 g mL⁻¹ at 4°C (277 K). (This value is not a coincidence.
The gram was historically defined as the mass of 1 milliliter of water at 4°C.) Water expands as the
temperature drops below 4°C. This expansion is caused by the formation of the more 
open hydrogen-bonded ice crystal in which each water molecule is hydrogen-bonded 
rigidly to four others. As a result ice is slightly less dense (0.924 g mL⁻¹) than liquid
water whose molecules can move enough to pack more closely. Because ice is less dense 
than liquid water it floats and water freezes from the top down. This has important bio- 
logical implications since a layer of ice on a pond insulates the creatures below from ex- 
treme cold. 
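
The two densities above also show how much of a floating block of ice sits below the waterline. The sketch below applies Archimedes' principle (a floating object displaces its own weight of liquid), which the text does not state explicitly. It uses the fresh-water density quoted above; seawater is denser, so the fraction for a real iceberg would differ somewhat. The function name is hypothetical.

```python
def submerged_fraction(density_solid, density_liquid):
    """Fraction of a floating object's volume that lies below the surface.
    Follows from Archimedes' principle; only valid if the object floats."""
    if density_solid > density_liquid:
        raise ValueError("object is denser than the liquid and will sink")
    return density_solid / density_liquid

# Densities in g/mL, as quoted in the text
fraction = submerged_fraction(0.924, 1.000)
print(f"{fraction:.1%} of the ice is below the waterline")
```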

Two additional properties of water are related to its hydrogen-bonding characteris- 
tics — its specific heat and its heat of vaporization. The specific heat of a substance is the 
amount of heat needed to raise the temperature of 1 gram of the substance by 1°C. This 
property is also called the heat capacity. In the case of water, a relatively large amount of 
heat is required to raise the temperature because each water molecule participates in 
multiple hydrogen bonds that must be broken in order for the kinetic energy of the 
water molecules to increase. The abundance of water in the cells and tissues of all large 
multicellular organisms means that temperature fluctuations within cells are minimized. 

▲ Figure 2.4 

Hydrogen bonding by a water molecule. A
water molecule can form up to four hydrogen
bonds: the oxygen atom of a water molecule
is the hydrogen acceptor for two hydrogen
atoms, and each O — H group serves
as a hydrogen donor.

▲ Icebergs. Ice floats because it is less 
dense than water. However, it is only slightly 
less dense than water so most of the mass 
of floating ice lies underwater. 

◄ Figure 2.5 

Structure of ice. Water molecules in ice form 
an open hexagonal lattice in which every 
water molecule is hydrogen-bonded to four 
others. The geometrical regularity of these 
hydrogen bonds contributes to the strength 
of the ice crystal. The hydrogen-bonding 
pattern of ice is more regular than that of 
water. The absolute structure of liquid water 
has not been determined. 



Some species can grow and reproduce at temperatures very 
close to 0°C, or even lower. There are cold-blooded fish, for 
example, that survive at ocean temperatures below 0°C (salt 
lowers the freezing point of water). 

At the other extreme are bacteria that live in hot springs 
where the average temperature is above 80°C. Some bacteria 
inhabit the environment around deep ocean thermal vents 
(black smokers) where the average temperature is more than 
100°C. (The high pressure at the bottom of the ocean raises 
the boiling point of water.) 

The record for extreme thermophiles is Strain 121, a 
species of archaebacteria that grows and reproduces at 
121°C! These extreme thermophiles are among the earliest 
branching lineages on the web of life. It’s possible that the 
first living cells arose near deep ocean vents. 

Deep ocean 
vent. ► 


▲ Figure 2.6 

Dissolution of sodium chloride (NaCl) in water.

(a) The ions of crystalline sodium chloride
are held together by electrostatic forces. (b)
Water weakens the interactions between the
positive and negative ions and the crystal
dissolves. Each dissolved Na⁺ and Cl⁻ is
surrounded by a solvation sphere. Only one
layer of solvent molecules is shown. Interactions
between ions and water molecules are
indicated by dashed lines.

This feature is of critical biological importance since the rates of most biochemical reac- 
tions are sensitive to temperature. 

The heat of vaporization of water (about 2260 J g⁻¹) is also much higher than that of
many other liquids. A large amount of heat is required to convert water from a liquid 
to a gas because hydrogen bonds must be broken to permit water molecules to dissoci- 
ate from one another and enter the gas phase. Because the evaporation of water 
absorbs so much heat, perspiration is an effective mechanism for decreasing body temperature.
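
To get a feel for the size of this effect, the sketch below estimates the cooling produced by evaporating sweat. It rests on several assumptions that are not in the text: the body is treated as if it had the specific heat of liquid water (4.184 J g⁻¹ °C⁻¹), all of the heat of vaporization is drawn from the body, and the 70 kg body mass and 100 g of sweat are arbitrary illustrative numbers.

```python
HEAT_OF_VAPORIZATION = 2260.0   # J per gram of water (value from the text)
SPECIFIC_HEAT_WATER = 4.184     # J per gram per degree C (assumed, not from the text)

def cooling_from_evaporation(sweat_g, body_mass_g):
    """Temperature drop (degrees C) if all the heat needed to evaporate
    the sweat is drawn from a body with the specific heat of water."""
    heat_removed = sweat_g * HEAT_OF_VAPORIZATION
    return heat_removed / (body_mass_g * SPECIFIC_HEAT_WATER)

# Hypothetical example: 100 g of sweat, 70 kg body
drop = cooling_from_evaporation(sweat_g=100.0, body_mass_g=70_000.0)
print(f"Estimated cooling: {drop:.2f} degrees C")
```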

2.3 Water Is an Excellent Solvent 

The physical properties of water combine to make it an excellent solvent. We have al- 
ready seen that water molecules are polar and this property has important conse- 
quences, as we will see below. In addition, water has a low intrinsic viscosity that does 
not greatly impede the movement of dissolved molecules. Finally, water molecules 
themselves are small compared to some other solvents such as ethanol and benzene. 
The small size of water molecules means that many of them can associate with solute 
particles to make them more soluble. 

A. Ionic and Polar Substances Dissolve in Water 

Water can interact with and dissolve other polar compounds and compounds that ion- 
ize. Ionization is associated with the gain or loss of an electron, or an H + ion, giving rise 
to an atom or a molecule that carries a net charge. Molecules that can dissociate to form 
ions are called electrolytes. Substances that readily dissolve in water are said to be 
hydrophilic, or water loving. (We will discuss hydrophobic, or water fearing, substances 
in the next section.) 

Why are electrolytes soluble in water? Recall that water molecules are polar. This 
means they can align themselves around electrolytes so that the negative oxygen atoms 
of the water molecules are oriented toward the cations (positively charged ions) of the 
electrolytes and the positive hydrogen atoms are oriented toward the anions (negatively 
charged ions). Consider what happens when a crystal of sodium chloride (NaCl) dissolves
in water (Figure 2.6). The polar water molecules are attracted to the charged ions
in the crystal. The attractions result in sodium and chloride ions on the surface of the
crystal dissociating from one another and the crystal begins to dissolve. Because there
are many polar water molecules surrounding each dissolved sodium and chloride ion,
the interactions between the opposite electric charges of these ions become much weaker
than they are in the intact crystal. As a result of their interactions with water molecules, the
ions of the crystal continue to dissociate until the solution becomes saturated. At this 
point, the ions of the dissolved electrolyte are present at high enough concentrations for 
them to again attach to the solid electrolyte, or crystallize, and an equilibrium is estab- 
lished between dissociation and crystallization. 


There was a time when people believed that the ionic compo- 
sition of blood plasma resembled that of seawater. This was 
supposed to be evidence that primitive organisms lived in the 
ocean and land animals evolved a system of retaining the 
ocean-like composition of salts. 

Careful studies of salt concentrations in the early 20th 
century revealed that the concentration of salts in the ocean
was much higher than in blood plasma. Some biochemists
tried to explain this discrepancy by postulating that the com- 
position of blood plasma didn’t resemble the seawater of 
today but it did resemble the composition of ancient seawa- 
ter from several hundred million years ago when multicellu- 
lar animals arose. 

We now know that the saltiness of the ocean hasn’t 
changed very much from the time it first formed over three 
billion years ago. There is no direct connection between the 
saltiness of blood plasma and seawater. Not only are the overall 

▼ The concentrations of various ions in seawater (blue) and human
blood plasma (red) are compared. Seawater is much saltier and
contains much higher proportions of magnesium and sulfates. Blood
plasma is enriched in bicarbonate (see Section 2.10).
[Bar chart: seawater versus blood plasma for Na⁺, K⁺, Mg²⁺, Ca²⁺, Cl⁻, SO₄²⁻, and HCO₃⁻.]

concentrations of the major ions (Na⁺, K⁺, and Cl⁻) very different
but the relative concentrations of various other ionic
species are even more different.

The ionic composition of blood plasma is closely mim- 
icked by Ringer’s solution, which also contains lactate as a 
carbon source. Ringer’s solution can be used as a temporary 
substitute for blood plasma when a patient has suffered 
blood loss or dehydration. 

            Blood plasma    Ringer's
Na⁺         140 mM          130 mM
K⁺          4 mM            4 mM
Cl⁻         103 mM          109 mM
Ca²⁺        2 mM            2 mM
Lactate     5 mM            28 mM


▲ Figure 2.7 

Structure of glucose. Glucose contains five 
hydroxyl groups and a ring oxygen, each of 
which can form hydrogen bonds with water. 

▲ Figure 2.8 

Diffusion. (a) If the cytoplasm were simply
made up of water, a small molecule (red) 
would diffuse from one end of a cell to the 
other via a random walk, (b) The average time 
could be about 10 times longer in a crowded 
cytoplasm, with larger molecules (green). 

Each dissolved Na⁺ attracts the negative ends of several water molecules whereas
each dissolved Cl⁻ attracts the positive ends of several water molecules (Figure 2.6b).
The shell of water molecules that surrounds each ion is called a solvation sphere and it 
usually contains several layers of solvent molecules. A molecule or ion surrounded by 
solvent molecules is said to be solvated. When the solvent is water, such molecules or 
ions are said to be hydrated. 

Electrolytes are not the only hydrophilic substances that are soluble in water. Any 
polar molecule will have a tendency to become solvated by water molecules. In addi- 
tion, the solubility of many organic molecules is enhanced by formation of hydrogen 
bonds with water molecules. Ionic organic compounds such as carboxylates and protonated
amines owe their solubility in water to their polar functional groups. Other
groups that confer water solubility include amino, hydroxyl, and carbonyl groups. Mol- 
ecules containing such groups disperse among water molecules with their polar groups 
forming hydrogen bonds with water. 

An increase in the number of polar groups in an organic molecule increases its sol- 
ubility in water. The carbohydrate glucose contains five hydroxyl groups and a ring oxy- 
gen (Figure 2.7) and is very soluble in water (up to 83 grams of glucose can dissolve in 
100 milliliters of water at 17.5°C). Each oxygen atom of glucose can form hydrogen 
bonds with water. We will see in other chapters that the attachment of carbohydrates to 
some otherwise poorly soluble molecules, including lipids and the bases of nucleosides, 
increases their solubility. 
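
The glucose solubility quoted above is given in grams, whereas Table 2.1 reports alcohol solubilities in moles. The short conversion below is a sketch using rounded standard atomic masses, which are not given in the text.

```python
# Standard atomic masses in g/mol (rounded; not given in the text)
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "O": 15.999}

glucose_formula = {"C": 6, "H": 12, "O": 6}   # C6H12O6
molar_mass = sum(ATOMIC_MASS[element] * count
                 for element, count in glucose_formula.items())

grams_dissolved = 83.0   # g per 100 mL of water at 17.5 degrees C (from the text)
moles_dissolved = grams_dissolved / molar_mass

print(f"Molar mass of glucose: {molar_mass:.2f} g/mol")
print(f"Solubility: {moles_dissolved:.2f} mol per 100 mL of water")
```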

B. Cellular Concentrations and Diffusion 

The inside of a cell can be very crowded as suggested by David Goodsell's drawings
(Figure 1.17). Consequently, the behavior of solutes in the cytoplasm will be different 
from their behavior in a simple solution of water. One of the most important differ- 
ences is reduction of the diffusion rate inside cells. 

There are three reasons why solutes diffuse more slowly in cytoplasm. 

1. The viscosity of cytoplasm is higher than that of water due to the presence of many 
solutes such as sugars. This is not an important factor because recent measure- 
ments suggest that the viscosity of cytoplasm is only slightly greater than water 
even in densely packed organelles. 

2. Charged molecules bind transiently to each other inside cells and this restricts their 
mobility. These binding effects have a small but significant effect on diffusion rates. 

3. Collisions with other molecules inhibit diffusion due to an effect called molecular 
crowding. This is the main reason why diffusion is slowed in the cytoplasm. 

For small molecules, the diffusion rate inside cells is never more than one-quarter 
the rate in pure water. For large molecules, such as proteins, the diffusion rate in the cy- 
toplasm may be slowed to about 5% to 10% of the rate in water. This slowdown is due 
largely to molecular crowding. 

For an individual molecule, the rate of diffusion in water at 20°C is described by 
the diffusion coefficient (D₂₀,w). For the protein myoglobin, D₂₀,w = 11.3 × 10⁻⁷ cm² s⁻¹.
From this value we can calculate that the average time to diffuse from one end of a cell
to the other (~10 μm) is about 0.44 seconds.

But this value is the diffusion time in pure water. In the crowded
environment of a typical cell it could take about 10 times longer (4 s). The slower rate is 
due to the fact that a protein like myoglobin will be constantly bumping into other large 
molecules. Nevertheless, 4 seconds is still a short time. It means that most molecules, in- 
cluding smaller metabolites and ions, will encounter each other frequently inside a typ- 
ical cell (Figure 2.8). Recent direct measurements of diffusion inside cells reveal that the 
effects of molecular crowding are less significant than we used to believe. 
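
The 0.44 second figure can be reproduced from the diffusion coefficient. The text does not say which formula was used; the sketch below assumes the standard one-dimensional random-walk relation, t = x²/(2D), which happens to give the quoted value. The factor of 10 for crowded cytoplasm is the rough estimate from the text, not a measured constant.

```python
D_20W = 11.3e-7          # cm^2/s, myoglobin in water at 20 degrees C (from the text)
CELL_LENGTH_CM = 10e-4   # 10 micrometers expressed in centimeters

def diffusion_time(distance_cm, diffusion_coefficient):
    """Mean time (s) to diffuse a given distance in one dimension,
    assuming t = x^2 / (2D)."""
    return distance_cm ** 2 / (2.0 * diffusion_coefficient)

t_water = diffusion_time(CELL_LENGTH_CM, D_20W)
t_crowded = 10.0 * t_water   # rough tenfold slowdown quoted in the text

print(f"Pure water:        {t_water:.2f} s")
print(f"Crowded cytoplasm: {t_crowded:.1f} s")
```

Because the time scales with the square of the distance, doubling the cell length would quadruple the diffusion time under this model.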

C. Osmotic Pressure 

If a solvent-permeable membrane separates two solutions that contain different con- 
centrations of dissolved substances, or solutes, then molecules of solvent will diffuse 
from the less concentrated solution to the more concentrated solution in a process
called osmosis. The pressure required to prevent the flow of solvent is called osmotic
pressure. The osmotic pressure of a solution depends on the total molar concentration 
of solute, not on its chemical nature. 

Water-permeable membranes separate the cytosol from the external medium. The
compositions of intracellular solutions are quite different from those of extracellular 
solutions with some compounds being more concentrated and some less concentrated 
inside cells. In general, the concentrations of solutes inside the cell are much higher 
than their concentrations in the aqueous environment outside the cell. Water molecules 
tend to move across the cell membrane in order to enter the cell and dilute the solution 
inside the cell. The influx of water causes the cell’s volume to increase but this expan- 
sion is limited by the cell membrane. In extreme cases, such as when red blood cells are 
diluted in pure water, the internal pressure causes the cells to burst. Some species (e.g., 
plants and bacteria) have rigid cell walls that prevent the membrane expansion. These 
cells can develop high internal pressures. 

Most cells use several strategies to keep the osmotic pressure from becoming too 
great and bursting the cell. One strategy involves condensing many individual mole- 
cules into a macromolecule. For example, animal cells that store glucose package it as a 
polymer called glycogen which contains about 50,000 glucose residues. If the glucose 
molecules were not condensed into a single glycogen molecule the influx of water nec- 
essary to dissolve each glucose molecule would cause the cell to swell and burst. Another 
strategy is to surround cells with an isotonic solution that negates a net efflux or influx 
of water. Blood plasma, for example, contains salts and other molecules that mimic the 
osmolarity inside red blood cells (see Box 2.2). 
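
Because osmotic pressure depends on the number of dissolved particles rather than on their size, packaging glucose as glycogen lowers the osmotic contribution dramatically. The sketch below illustrates this with the van 't Hoff relation for ideal dilute solutions, Π = cRT, which is not given in the text. The 0.1 M glucose concentration and the temperature of 25°C are arbitrary assumptions; only the figure of 50,000 residues per glycogen molecule comes from the text.

```python
R = 8.314     # gas constant, J mol^-1 K^-1
T = 298.15    # 25 degrees C in kelvin (assumed)

def osmotic_pressure_pa(molar_concentration):
    """van 't Hoff estimate, Pi = cRT, for an ideal dilute solution.
    Concentration is in mol/L; the result is in pascals."""
    return molar_concentration * 1000.0 * R * T   # mol/L -> mol/m^3

glucose_concentration = 0.1        # mol/L of glucose units (hypothetical)
residues_per_glycogen = 50_000     # from the text

free = osmotic_pressure_pa(glucose_concentration)
stored = osmotic_pressure_pa(glucose_concentration / residues_per_glycogen)

print(f"Free glucose: {free / 1000.0:.0f} kPa")
print(f"As glycogen:  {stored:.1f} Pa")
```

The same amount of glucose contributes 50,000 times less osmotic pressure when it is stored as a single polymer.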

2.4 Nonpolar Substances Are Insoluble in Water 

Hydrocarbons and other nonpolar substances have very low solubility in water because 
water molecules tend to interact with other water molecules rather than with nonpolar 
molecules. As a result, water molecules exclude nonpolar substances forcing them to as- 
sociate with each other. For example, tiny oil droplets that are vigorously dispersed in 
water tend to coalesce to form a single drop thereby minimizing the area of contact be- 
tween the two substances. This is why the oil in a salad dressing separates if you let it sit 
for any length of time before putting it on your salad. 

Nonpolar molecules are said to be hydrophobic, or water fearing, and this phenome- 
non of exclusion of nonpolar substances by water is called the hydrophobic effect. The 
hydrophobic effect is critical for the folding of proteins and the self-assembly of biolog- 
ical membranes. 

The number of polar groups in a molecule affects its solubility in water. Solubility 
also depends on the ratio of polar to nonpolar groups in a molecule. For example, one-, 
two-, and three-carbon alcohols are miscible with water but larger hydrocarbons 
with single hydroxyl groups are much less soluble in water (Table 2.1). In the larger 

Table 2.1 Solubilities of short-chain alcohols in water

Structure          Solubility in water (mol/100 g H₂O at 20°C)ᵃ

CH₃OH
CH₃CH₂OH
CH₃(CH₂)₂OH
CH₃(CH₂)₃OH
CH₃(CH₂)₄OH
CH₃(CH₂)₅OH
CH₃(CH₂)₆OH

ᵃ Infinity (∞) indicates that there is no limit to the solubility of the alcohol in water.


▲ Hypertonic (a), isotonic (b) and 
hypotonic (c) red blood cells. 



▲ Figure 2.9 

Sodium dodecyl sulfate (SDS), a synthetic detergent.

molecules, the properties of the nonpolar hydrocarbon portion of the molecule over- 
ride those of the polar alcohol group and limit solubility. 

Detergents, sometimes called surfactants, are molecules that are both hydrophilic 
and hydrophobic. They usually have a hydrophobic chain at least 12 carbon atoms long 
and an ionic or polar end. Such molecules are said to be amphipathic. Soaps, which are
alkali metal salts of long-chain fatty acids, are one type of detergent. The soap sodium
palmitate (CH₃(CH₂)₁₄COO⁻ Na⁺), for example, contains a hydrophilic carboxylate
group and a hydrophobic tail. One of the synthetic detergents most commonly used in 
biochemistry is sodium dodecyl sulfate (SDS) which contains a 12-carbon tail and a 
polar sulfate group (Figure 2.9). 

The hydrocarbon portion of a detergent is soluble in nonpolar organic substances
and its polar group is soluble in water. When a detergent is spread on the surface
of water a monolayer forms in which the hydrophobic, nonpolar tails of the detergent
molecules extend into the air while the hydrophilic, ionic heads are hydrated,
extending into the water (Figure 2.10). When a sufficiently high concentration of
detergent is dispersed in water rather than layered on the surface, groups of
detergent molecules aggregate into micelles. In one common form of micelle, the
nonpolar tails of the detergent molecules associate with one another in the center of the
structure minimizing contact with water molecules. Because the tails are flexible, the 
core of a micelle is liquid hydrocarbon. The ionic heads project into the aqueous solu- 
tion and are therefore hydrated. Small, compact micelles may contain about 80 to 100 
detergent molecules. 

The cleansing action of soaps and other detergents derives from their ability to trap 
water-insoluble grease and oils within the hydrophobic interiors of micelles. SDS and
similar synthetic detergents are common active ingredients in laundry detergents. The 
suspension of nonpolar compounds in water by their incorporation into micelles is 
termed solubilization. Solubilizing nonpolar molecules is a different process than dis- 
solving a polar compound. A number of the structures that we will encounter later in 
this book, including proteins and biological membranes, resemble micelles in having 
hydrophobic interiors and hydrophilic surfaces. 

Some dissolved ions such as SCN⁻ (thiocyanate) and ClO₄⁻ (perchlorate) are
called chaotropes. These ions are poorly solvated compared to ions such as NH₄⁺,
SO₄²⁻, and H₂PO₄⁻. Chaotropes enhance the solubility of nonpolar compounds in
water by disordering the water molecules (there is no general agreement on how 
chaotropes do this). We will encounter other examples of chaotropic agents such as the 
guanidinium ion and the nonionic compound urea when we discuss denaturation and 
the three-dimensional structures of proteins and nucleic acids. 

▲ Figure 2.10 

Cross-sectional views of structures formed by detergents in water. Detergents can form mono- 
layers at the air-water interface. They can also form micelles, aggregates of detergent mol- 
ecules in which the hydrocarbon tails (yellow) associate in the water-free interior and the 
polar head groups (blue) are hydrated. 

2.5 Noncovalent Interactions

So far in this chapter we have introduced two types of noncovalent interactions — 
hydrogen bonds and hydrophobic interactions. Weak interactions such as these play ex- 
tremely important roles in determining the structures and functions of macromole- 
cules. Weak forces are also involved in the recognition of one macromolecule by 
another and in the binding of reactants to enzymes. 

There are actually four major noncovalent bonds or forces. In addition to hydrogen 
bonds and hydrophobicity there are also charge-charge interactions and van der Waals 
forces. Charge-charge interactions, hydrogen bonds, and van der Waals forces are varia- 
tions of a more general type of force called electrostatic interactions. 

A. Charge-Charge Interactions 

Charge-charge interactions are electrostatic interactions between two charged particles. 
These interactions are potentially the strongest noncovalent forces and can extend over 
greater distances than other noncovalent interactions. The stabilization of NaCl crystals 
by interionic attraction between the sodium (Na⁺) and chloride (Cl⁻) ions is an example
of a charge-charge interaction. The strength of such interactions in solution depends
on the nature of the solvent. Since water greatly weakens these interactions, the
stability of macromolecules in an aqueous environment is not strongly dependent on
charge-charge interactions but they do occur. An example of charge-charge interactions
in proteins is when oppositely charged functional groups attract one another. The
interaction is sometimes called a salt bridge and it is usually buried deep within the
hydrophobic interior of a protein where it can't be disrupted by water molecules. The
most accurate term for such interactions is ion pairing. 
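
The weakening effect of water on charge-charge interactions can be illustrated with Coulomb's law in a uniform medium, E = q₁q₂/(4πε₀εᵣr). Neither the formula nor the numbers below come from the text: the relative permittivity of 78.5 for water near 25°C and the 0.3 nm separation are assumed values, and treating the solvent as a continuous medium is only a crude approximation at atomic distances. The function name is hypothetical.

```python
import math

E_CHARGE = 1.602176634e-19     # elementary charge, C
EPSILON_0 = 8.8541878128e-12   # vacuum permittivity, F/m
AVOGADRO = 6.02214076e23       # particles per mole

def coulomb_energy(z1, z2, distance_nm, rel_permittivity):
    """Coulomb energy (kJ/mol) of two point charges z1*e and z2*e
    separated by distance_nm in a uniform dielectric medium."""
    r = distance_nm * 1e-9   # nm -> m
    joules = (z1 * z2 * E_CHARGE ** 2
              / (4.0 * math.pi * EPSILON_0 * rel_permittivity * r))
    return joules * AVOGADRO / 1000.0

# Hypothetical ion pair: charges of +1 and -1 held 0.3 nm apart
in_vacuum = coulomb_energy(+1, -1, 0.3, rel_permittivity=1.0)
in_water = coulomb_energy(+1, -1, 0.3, rel_permittivity=78.5)  # assumed value for water

print(f"Vacuum: {in_vacuum:.0f} kJ/mol")
print(f"Water:  {in_water:.1f} kJ/mol")
```

Negative values indicate attraction. In this model the interaction in water is weaker than in vacuum by exactly the assumed permittivity factor, which is consistent with the minor role of charge-charge interactions at solvent-exposed surfaces.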

Charge-charge interactions are also responsible for the mutual repulsion of simi- 
larly charged ionic groups. Charge repulsion can influence the structures of individual 
biomolecules as well as their interactions with other, like-charged molecules.

In addition to their relatively minor contribution to the stabilization of large mole- 
cules, charge-charge interactions play a role in the recognition of one molecule by an- 
other. For example, most enzymes have either anionic or cationic sites that bind oppo- 
sitely charged reactants. 

B. Hydrogen Bonds 

Hydrogen bonds, which are also a type of electrostatic interaction, occur in many 
macromolecules and are among the strongest noncovalent forces in biological systems. 

The strengths of hydrogen bonds such as those between substrates and enzymes and 
those between the bases of DNA are estimated to be about 25-30 kJ mol⁻¹. These hydrogen
bonds are a bit stronger than those formed between water molecules (Section 2.2).
Hydrogen bonds in biochemical molecules are strong enough to confer structural sta- 
bility but weak enough to be broken readily. 

In general, when a hydrogen atom is covalently bonded to a strongly elec- 
tronegative atom, such as nitrogen, oxygen, or sulfur, a hydrogen bond can only 
form when the hydrogen atom lies approximately 0.2 nm from another strongly 
electronegative atom with an unshared electron pair. As previously described in 
the case of hydrogen bonds between water molecules the covalently bonded atom 
(designated D in Figure 2.11a) is the hydrogen donor and the atom that attracts the 
proton (designated A in Figure 2.11a) is the hydrogen acceptor. The total distance be-
tween the two electronegative atoms participating in a hydrogen bond is typically be- 
tween 0.27 nm and 0.30 nm. Some common examples of hydrogen bonds are shown 
in Figure 2.11b. 

A hydrogen bond has many of the characteristics of a covalent bond but it is much 
weaker. You can think of a hydrogen bond as a partial sharing of electrons. (Recall that 
in a true covalent bond a pair of electrons is shared between two atoms.) The three atoms 
involved in a hydrogen bond are usually aligned to form a straight line where the center 
of the hydrogen atom falls directly on a line drawn between the two electronegative

▲ Salt bridges, (a) One kind of salt bridge, 
(b) Another kind of salt bridge. 


Figure 2.12 ▲ 

Hydrogen bonding between the complementary bases guanine and cytosine in DNA. 

▲ Figure 2.11

Hydrogen bonds. (a) Hydrogen bonding between
a — D — H group (the hydrogen donor)
and an electronegative atom A — (the hydro- 
gen acceptor). A typical hydrogen bond is ap- 
proximately 0.2 nm long, roughly twice the 
length of the covalent bond between hydrogen 
and nitrogen, oxygen, or sulfur. The total dis- 
tance between the two electronegative atoms 
participating in a hydrogen bond is therefore 
approximately 0.3 nm. (b) Examples of bio- 
logically important hydrogen bonds. 

Hydrogen bonding between base pairs 
in double-stranded DNA makes only a 
small contribution to the stability of 
DNA, as described in Section 19.2C. 


Hydrogen bonds between and within 
biological molecules are easily disrupted 
by competition with water molecules. 

atoms. Small deviations from this alignment are permitted but such hydrogen bonds are 
weaker than the standard form. 

All of the functional groups shown in Figure 2.11 are also capable of forming hy- 
drogen bonds with water molecules. In fact, when they are exposed to water they are far 
more likely to interact with water molecules because the concentration of water is so 
high. In order for hydrogen bonds to form between, or within, biochemical macromol- 
ecules the donor and acceptor groups have to be shielded from water. In most cases, this 
shielding occurs because the groups are buried in the hydrophobic interior of the 
macromolecule where water can’t penetrate. In DNA, for example, the hydrogen bonds
between complementary base pairs are in the middle of the double helix (Figure 2.12). 

C. Van der Waals Forces 

The third weak force involves the interactions between permanent or transient dipoles 
of two molecules. These forces are of short range and small magnitude, about 13 kJ
mol⁻¹ and 0.8 kJ mol⁻¹, respectively.

These electrostatic interactions are called van der Waals forces, named after the
Dutch physicist Johannes Diderik van der Waals. They occur only when atoms are very
close together. Van der Waals forces involve both attraction and repulsion. The
attractive forces, also known as London dispersion forces, originate from the
infinitesimal dipole generated in atoms by the random movement of the negatively
charged electrons around the positively charged nucleus. Thus, van der Waals forces
are dipolar, or electrostatic, attractions between the nuclei of atoms or molecules
and the electrons of other atoms or molecules. The strength of the interaction
between the transiently induced dipoles of nonpolar molecules such as methane is
about 0.4 kJ mol⁻¹ at an internuclear separation of 0.3 nm. Although they operate
over similar distances, van der Waals forces are much weaker than hydrogen bonds.

There is also a repulsive component to van der Waals forces. When two atoms are 
squeezed together the electrons in their orbitals repel each other. The repulsion in- 
creases exponentially as the atoms are pressed together and at very close distances it be- 
comes prohibitive. 

The sum of the attractive and repulsive components of van der Waals forces yields 
an energy profile like that in Figure 2.13. At large intermolecular distances the two atoms 
do not interact and there are no attractive or repulsive forces between them. As the atoms 
approach each other (moving toward the left in the diagram) the attractive force in- 
creases. This attractive force is due to the delocalization of the electron cloud around the 
atoms. You can picture this as a shift in electrons around one of the atoms such that the 
electrons tend to localize on the side opposite that of the other approaching atom. This 
shift creates a local dipole where one side of the atom has a slight positive charge and the 
other side has a slight negative charge. The side with the small positive charge attracts the 
other negatively charged atom. As the atoms move even closer together the effect of this 
dipole diminishes and the overall influence of the negatively charged electron cloud be- 
comes more important. At short distances the atoms repel each other. 


The optimal packing distance is the point at which the attractive forces are
maximized. This distance corresponds to the energy trough in Figure 2.13 and it is equal to
the sum of the van der Waals radii of the two atoms. When the atoms are separated by
the sum of their two van der Waals radii they are said to be in van der Waals contact.
Typical van der Waals radii of several atoms are shown in Table 2.2.

In some cases, the shift in electrons is influenced by the approach of another atom.
This is an induced dipole. In other cases, the delocalization of electrons is a permanent
feature of the molecule as we saw in the case of water (Section 2.1). These permanent
dipoles also give rise to van der Waals forces.
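The energy profile described above can be sketched numerically. The example below is only an illustration: it assumes the Lennard-Jones 12-6 function, a common simple model that is not taken from this chapter (the text describes the repulsion as exponential, whereas this model uses an inverse twelfth power). The well depth of 0.4 kJ mol⁻¹ and the contact distance of 0.3 nm are illustrative values borrowed loosely from the methane example, and the function name is hypothetical.

```python
def lennard_jones_energy(r_nm, r_contact_nm=0.3, well_depth_kj=0.4):
    """Interaction energy (kJ/mol) of two atoms separated by r_nm.

    Assumed model: Lennard-Jones 12-6 potential, written so that the
    minimum (-well_depth_kj) falls at r_contact_nm, which plays the role
    of the sum of the two van der Waals radii.
    """
    ratio = r_contact_nm / r_nm
    repulsion = ratio ** 12          # dominates at short distances
    attraction = 2.0 * ratio ** 6    # dominates at longer distances
    return well_depth_kj * (repulsion - attraction)


# Scan from "squeezed together" to "far apart"
for r in (0.25, 0.30, 0.40, 0.60, 1.00):
    print(f"r = {r:.2f} nm  E = {lennard_jones_energy(r):+.3f} kJ/mol")
```

In this sketch the energy is positive (repulsive) below the contact distance, reaches its minimum at the contact distance, and approaches zero at large separations, which is the qualitative shape described for Figure 2.13.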

Although individual van der Waals forces are weak, the clustering of atoms 
within a protein, nucleic acid, or biological membrane permits formation of a large 
number of these weak interactions. Once formed, these cumulative weak forces play 
important roles in maintaining the structures of the molecules. For example, the het- 
erocyclic bases of nucleic acids are stacked one above another in double-stranded 
DNA. This arrangement is stabilized by a variety of noncovalent interactions, espe- 
cially van der Waals forces. These forces are collectively known as stacking interac- 
tions (see Chapter 19). 

D. Hydrophobic Interactions 

The association of a relatively nonpolar molecule or group with other nonpolar molecules 
is termed a hydrophobic interaction. Although hydrophobic interactions are sometimes 
called hydrophobic “bonds,” this description is incorrect. Nonpolar molecules don’t aggregate
because of mutual attraction but because the polar water molecules surrounding them
tend to associate with each other rather than with the nonpolar molecules (Section 2.4). 
For example, micelles (Figure 2.10) are stabilized by hydrophobic interactions. 

The hydrogen-bonding pattern of water is disrupted by the presence of a nonpolar 
molecule. Thus, water molecules surrounding a less polar molecule in solution are more 
restricted in their interactions with other water molecules. These restricted water mole- 
cules are relatively immobile, or ordered, in the same way that molecules at the surface 
of water are ordered in the familiar phenomenon of surface tension. However, water 
molecules in the bulk solvent phase are much more mobile, or disordered. In thermo- 
dynamic terms, there is a net gain in the combined entropy of the solvent and the non- 
polar solute when the nonpolar groups aggregate and water is freed from its ordered 
state surrounding the nonpolar groups. 

Hydrophobic interactions, like hydrogen bonds, are much weaker than covalent bonds
but stronger than van der Waals interactions. For example, the energy required to transfer a
—CH₂— group from a hydrophobic to an aqueous environment is about 3 kJ mol⁻¹.

Although individual hydrophobic interactions are weak, the cumulative effect of 
many hydrophobic interactions can have a significant effect on the stability of a
macromolecule. The three-dimensional structure of most proteins, for example, is largely
determined by hydrophobic interactions formed during the spontaneous folding of the
polypeptide chain. Water molecules are bound to the outside surface of the protein but 
can’t penetrate the interior where most of the nonpolar groups are located. 

All four of the interactions covered here are individually weak compared to cova- 
lent bonds but the combined effect of many such weak interactions can be quite 
strong. The most important noncovalent interactions in biomolecules are shown in 
Figure 2.14. 

▲ Figure 2.13 

Effect of internuclear separation on van der 
Waals forces. Van der Waals forces are 
strongly repulsive at short internuclear dis- 
tances and very weak at long internuclear 
distances. When two atoms are separated by 
the sum of their van der Waals radii, the van 
der Waals attraction is maximal. 

Table 2.2 Van der Waals radii of several atoms

[Table columns: Atom, Radius (nm); entries missing]

Weak interactions are individually weak 
but the combined effect of a large number 
of weak interactions is a significant 
organizing force. 

2.6 Water Is Nucleophilic 

In addition to its physical properties, the chemical properties of water are also impor- 
tant in biochemistry because water molecules can react with biological molecules. The 
electron-rich oxygen atom determines much of water’s reactivity in chemical reactions.
Electron-rich chemicals are called nucleophiles (nucleus lovers) because they seek posi- 
tively charged (electron-deficient) species called electrophiles (electron lovers). Nucle- 
ophiles are either negatively charged or have unshared pairs of electrons. They attack 



[Figure 2.14 panels]
Charge-charge interaction: ~40 to 200 kJ mol⁻¹
Hydrogen bond: ~25 to 30 kJ mol⁻¹
van der Waals interaction: ~0.4 to 4 kJ mol⁻¹
Hydrophobic interaction: ~3 to 10 kJ mol⁻¹

▲ Figure 2.14 

Typical noncovalent interactions in biomole- 
cules. Charge-charge interactions, hydrogen 
bonds, and van der Waals interactions are 
electrostatic interactions. Hydrophobic inter- 
actions depend on the increased entropy of 
the surrounding water molecules rather than 
on direct attraction between nonpolar 
groups. For comparison, the dissociation energy for a covalent bond such as C—H or
C—C is approximately 340-450 kJ mol⁻¹.

[Figure 2.15 reaction scheme: a dipeptide reacts with H₂O to give two amino acids]

Figure 2.15 ▲ 

Hydrolysis of a peptide. In the presence of water the peptide bonds in proteins and peptides are 
hydrolyzed. Condensation, the reverse of hydrolysis, is not thermodynamically favored. 

electrophiles during substitution or addition reactions. The most common nucleophilic 
atoms in biology are oxygen, nitrogen, sulfur, and carbon. 

The oxygen atom of water has two unshared pairs of electrons making it nucle- 
ophilic. Water is a relatively weak nucleophile but its cellular concentration is so high 
that one might reasonably expect it to be very reactive. Many macromolecules should be 
easily degraded by nucleophilic attack by water. This is, in fact, a correct expectation. 
Proteins, for example, are hydrolyzed, or degraded, by water to release their monomeric 
units, amino acids (Figure 2.15). The equilibrium for complete hydrolysis of a protein 
lies far in the direction of degradation; in other words, the ultimate fate of all proteins is 
destruction by hydrolysis! 

If there is so much water in cells then why aren’t all biopolymers rapidly degraded? 
Similarly, if the equilibrium lies toward breakdown, how does biosynthesis occur in an 
aqueous environment? Cells avoid these problems in several ways. For example, the 
linkages between the monomeric units of macromolecules, such as the peptide bonds in 
proteins and the ester linkages in DNA, are relatively stable in solution at cellular pH 
and temperature in spite of the presence of water. In this case, the stability of linkages 
refers to their rate of hydrolysis in water and not their thermodynamic stability. 

The chemical properties of water combined with its high concentration mean that 
the Gibbs free energy change for hydrolysis (ΔG) is negative. This means that all
hydrolysis reactions are thermodynamically favorable. However, the rate of the
reactions inside the cell is so slow that macromolecules are not appreciably degraded
by spontaneous hydrolysis during the average lifetime of a cell. It is important to keep in mind the
distinction between the preferred direction of a reaction, as indicated by the Gibbs free 
energy change, and the rate of the reaction, as indicated by the rate constant (Section 
1.4D). The key concept is that because of the activation energy there is no direct corre- 
lation between the rate of a reaction and the final equilibrium values of the reactants 
and products. 
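The distinction between direction and rate can be illustrated with a small calculation. The sketch below assumes the standard relations K = exp(−ΔG°/RT) for the equilibrium constant and the Arrhenius factor exp(−Ea/RT) for the rate constant; neither formula is written out in this section, and the free energy and activation energies used are made-up numbers for two hypothetical reactions.

```python
import math

R_KJ = 8.314e-3   # gas constant in kJ mol^-1 K^-1
T_K = 298.15      # 25 °C expressed in kelvin


def equilibrium_constant(delta_g_kj):
    """Equilibrium constant from a standard Gibbs free energy change (kJ/mol)."""
    return math.exp(-delta_g_kj / (R_KJ * T_K))


def arrhenius_factor(activation_energy_kj):
    """Factor exp(-Ea/RT) that scales the rate constant in the Arrhenius equation."""
    return math.exp(-activation_energy_kj / (R_KJ * T_K))


# Two hypothetical reactions with the same free energy change but
# different activation energies (all three numbers are invented).
delta_g = -10.0
for ea in (50.0, 100.0):
    print(f"Ea = {ea:5.1f} kJ/mol  K_eq = {equilibrium_constant(delta_g):.1f}  "
          f"rate factor = {arrhenius_factor(ea):.2e}")
```

Both hypothetical reactions have the same equilibrium constant, yet their rate factors differ by many orders of magnitude, which is the sense in which a favorable reaction can still be extremely slow.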

Cells can synthesize macromolecules in an aqueous environment even though 
condensation reactions — the reverse of hydrolysis — are thermodynamically unfavor- 
able. They do this by using the chemical potential energy of ATP to overcome an unfa- 
vorable thermodynamic barrier. Furthermore, the enzymes that catalyze such reactions 
exclude water from the active site where the synthesis reactions occur. These reactions 
usually follow two-step chemical pathways that differ from the reversal of hydrolysis. 
For example, the simple condensation pathway shown in Figure 2.15 is not the path- 
way that is used in living cells because the presence of high concentrations of water 
makes the direct condensation reaction extremely unfavorable. In the first synthetic 
step, which is thermodynamically uphill, the molecule to be transferred reacts with 
ATP to form a reactive intermediate. In the second step, the activated group is readily 



The density of water varies with tempera- 
ture. It is defined as 1.00000 g/ml at 
3.98°C. The density is 0.99987 at 0°C and 
0.99707 at 25°C. 

The molecular mass of the most
common form of water is Mr = 18.01056.
The concentration of pure water at
3.98°C is 55.5 M (1000 ÷ 18.01).
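The concentration quoted above follows directly from the density and the molecular mass. As a minimal sketch, the calculation can be repeated for the three temperatures mentioned in this box; the function and variable names are hypothetical, and the densities are assumed to be in g/ml throughout.

```python
MOLAR_MASS_WATER = 18.01056  # g/mol, value quoted in the text

# Densities in g/ml at each temperature (°C), as quoted in the text
DENSITY_G_PER_ML = {0.0: 0.99987, 3.98: 1.00000, 25.0: 0.99707}


def water_molarity(density_g_per_ml):
    """Moles of water in one litre of pure water."""
    grams_per_litre = density_g_per_ml * 1000.0
    return grams_per_litre / MOLAR_MASS_WATER


for temp_c, density in DENSITY_G_PER_ML.items():
    print(f"{temp_c:5.2f} °C: {water_molarity(density):.2f} M")
```

The value changes only in the third significant figure over this temperature range, so 55.5 M is a reasonable working approximation.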

Many biochemical reactions in- 
volve water as either a reactant or a 
product and the high concentration of 
water will affect the equilibrium of the 


There is a difference between the rate of 
a reaction and whether it is 
thermodynamically favorable. Biological 
molecules are stable because the rate of 
spontaneous hydrolysis is slow. 

transferred to the attacking nucleophile. In Chapter 22 we will see that the reactive
intermediate in protein synthesis is an aminoacyl-tRNA that is formed in a reaction
involving ATP. The net result of the biosynthesis reaction is to couple the condensation
to the hydrolysis of ATP. 

The role of ATP in coupled reactions is 
described in Section 10.7. 

2.7 Ionization of Water 

One of the important properties of water is its slight tendency to ionize. Pure water 
contains a low concentration of hydronium ions (H₃O⁺) and an equal concentration of
hydroxide ions (OH⁻). The hydronium and hydroxide ions are formed by a nucleophilic
attack of oxygen on one of the protons in an adjacent water molecule. 


$$\mathrm{H_2O} + \mathrm{H_2O} \rightleftharpoons \mathrm{H_3O^{+}} + \mathrm{OH^{-}} \qquad (2.2)$$

The red arrows in Reaction 2.2 show the movement of pairs of electrons. These ar- 
rows are used to depict reaction mechanisms and we will encounter many such dia- 
grams throughout this book. One of the free pairs of electrons on the oxygen will con- 
tribute to formation of a new O — H covalent bond between the oxygen atom of the 
hydronium ion and a proton (H⁺) abstracted from a water molecule. An O—H covalent
bond is broken in this reaction and the electron pair from that bond remains
associated with the oxygen atom of the hydroxide ion.

Note that the atoms in the hydronium ion contain eleven positively charged pro- 
tons (eight in the oxygen atom and three hydrogen protons) and ten negatively charged 
electrons (a pair of electrons in the inner orbital of the oxygen atom, one free electron 
pair associated with the oxygen atom, and three pairs in the covalent bonds). This results 
in a net positive charge which is why we refer to it as an ion (cation). The positive charge 
is usually depicted as if it were associated with the oxygen atom but, in fact, it is distrib- 
uted partially over the hydrogen atoms as well. Similarly, the hydroxide ion (anion) 
bears a net negative charge because it contains ten electrons whereas the nuclei of the 
oxygen and hydrogen atoms have a total of only nine positively charged protons. 



The density of water varies with the 
temperature (Box 2.2) and so does the 
ion product. The differences aren’t sig- 
nificant in the temperature ranges that 
we normally encounter in living cells, 
so we assume that the value 10⁻¹⁴
applies at all temperatures. (See 
Problem 17 at the end of this chapter.) 

The ionization reaction is a typical reversible reaction. The protonation and depro- 
tonation reactions take place very quickly. Hydroxide ions have a short lifetime in water 
and so do hydronium ions. Even water molecules themselves have only a transient
existence. The average water molecule is thought to exist for about one millisecond (10⁻³ s)
before losing a proton to become a hydroxide ion or gaining a proton to become a
hydronium ion. Note that the lifetime of a water molecule is still eight orders of
magnitude (10⁸) greater than the lifetime of a hydrogen bond.

Hydronium (H₃O⁺) ions are capable of donating a proton to another ion. Such
proton donors are referred to as acids according to the Brønsted-Lowry concept of
acids and bases. In order to simplify chemical equations we often represent the
hydronium ion as simply H⁺ (free proton or hydrogen ion) to reflect the fact that it is a major
source of protons in biochemical reactions. The ionization of water can then be
depicted as a simple dissociation of a proton from a single water molecule.

$$\mathrm{H_2O} \rightleftharpoons \mathrm{H^{+}} + \mathrm{OH^{-}} \qquad (2.3)$$

Reaction 2.3 is a convenient way to show the ionization of water but it does not re- 
flect the true structure of the proton donor which is actually the hydronium ion. Reac- 
tion 2.3 also obscures the fact that the ionization of water is actually a bimolecular reac- 
tion involving two separate water molecules as shown in Reaction 2.2. Fortunately, the 
dissociation of water is a reasonable approximation that does not affect our calculations 
or our understanding of the properties of water. We will make use of this assumption in 
the rest of the book. 

Hydroxide ions can accept a proton and be converted back into water molecules. 
Proton acceptors are called bases. Water can function as either an acid or a base as Reac- 
tion 2.2 demonstrates. 

The ionization of water can be analyzed quantitatively. Recall that the concentra- 
tions of reactants and products in a reaction will eventually reach an equilibrium where 
there is no net change in concentration. The ratio of these equilibrium concentrations 
defines the equilibrium constant (Keq). In the case of ionization of water,

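$$K_{\mathrm{eq}} = \frac{[\mathrm{H^{+}}][\mathrm{OH^{-}}]}{[\mathrm{H_2O}]} \qquad K_{\mathrm{eq}}[\mathrm{H_2O}] = [\mathrm{H^{+}}][\mathrm{OH^{-}}] \qquad (2.4)$$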

The equilibrium constant for the ionization of water has been determined under standard
conditions of pressure (1 atm) and temperature (25°C). Its value is 1.8 × 10⁻¹⁶ M. We
are interested in knowing the concentrations of protons and hydroxide ions in a solution
of pure water since these ions participate in many biochemical reactions. These
values can be calculated from Equation 2.4 if we know the concentration of water
([H₂O]) at equilibrium. Pure water at 25°C has a concentration of approximately 55.5 M
(see Box 2.2). A very small percentage of water molecules will dissociate to form H⁺
and OH⁻ when the ionization reaction reaches equilibrium. This will have a very small
effect on the final concentration of water molecules at equilibrium. We can simplify our
calculations by assuming that the concentration of water in Equation 2.4 is 55.5 M.
Substituting this value, and that of the equilibrium constant, gives

$$(1.8 \times 10^{-16}\ \mathrm{M})(55.5\ \mathrm{M}) = 1.0 \times 10^{-14}\ \mathrm{M^2} = [\mathrm{H^{+}}][\mathrm{OH^{-}}] \qquad (2.5)$$

The product obtained by multiplying the proton and hydroxide ion concentrations
([H⁺][OH⁻]) is called the ion product for water. This is a constant designated Kw (the
ion product constant for water). At 25°C the value of Kw is

$$K_w = [\mathrm{H^{+}}][\mathrm{OH^{-}}] = 1.0 \times 10^{-14}\ \mathrm{M^2} \qquad (2.6)$$

It is a fortunate coincidence that this is a nice round number rather than some awkward 
fraction because it makes calculations of ion concentrations much easier. Pure water is 


electrically neutral, so its ionization produces an equal number of protons and hydroxide
ions ([H⁺] = [OH⁻]). In the case of pure water, Equation 2.6 can therefore be rewritten as

$$K_w = [\mathrm{H^{+}}]^2 = 1.0 \times 10^{-14}\ \mathrm{M^2} \qquad (2.7)$$

Taking the square root of the terms in Equation 2.7 gives 

$$[\mathrm{H^{+}}] = 1.0 \times 10^{-7}\ \mathrm{M} \qquad (2.8)$$
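The arithmetic behind Equations 2.5 through 2.8 can be checked in a few lines. This is a minimal sketch using the values quoted in the text; the variable names are hypothetical.

```python
import math

K_EQ = 1.8e-16      # equilibrium constant for ionization of water at 25 °C (M)
WATER_CONC = 55.5   # concentration of pure water (M)

# Equation 2.5: the ion product is K_eq multiplied by [H2O]
k_w = K_EQ * WATER_CONC

# Equations 2.7 and 2.8: in pure water [H+] = [OH-], so [H+] = sqrt(K_w)
h_conc = math.sqrt(k_w)

print(f"K_w  = {k_w:.2e} M^2")
print(f"[H+] = {h_conc:.2e} M")
```

To two significant figures the results agree with the rounded values 1.0 × 10⁻¹⁴ M² and 1.0 × 10⁻⁷ M used in the text.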

Since [H⁺] = [OH⁻], the ionization of pure water produces 10⁻⁷ M H⁺ and 10⁻⁷ M
OH⁻. Pure water and aqueous solutions that contain equal concentrations of H⁺ and
OH⁻ are said to be neutral. Of course, not all aqueous solutions have equal
concentrations of H⁺ and OH⁻. When an acid is dissolved in water [H⁺] increases and the
solution is described as acidic. Note that when an acid is dissolved in water the
concentration of protons increases while the concentration of hydroxide ions decreases. This is
because the ion product constant for water (Kw) is unchanged (i.e., constant) and the
product of the concentrations of H⁺ and OH⁻ must always be 1.0 × 10⁻¹⁴ M² under
standard conditions (Equation 2.5). Dissolving a base in water decreases [H⁺] and
increases [OH⁻] above 1.0 × 10⁻⁷ M producing a basic, or alkaline, solution.

2.8 The pH Scale 

Many biochemical processes — including the transport of oxygen in the blood, the catal- 
ysis of reactions by enzymes, and the generation of metabolic energy during respiration 
or photosynthesis — are strongly affected by the concentration of protons. Although the 
concentration of H⁺ (or H₃O⁺) in cells is small relative to the concentration of water,
the range of [H⁺] in aqueous solutions is enormous so it is convenient to use a
logarithmic quantity called pH as a measure of the concentration of H⁺. pH is defined as the
negative logarithm of the concentration of H⁺.

$$\mathrm{pH} = -\log[\mathrm{H^{+}}] = \log\frac{1}{[\mathrm{H^{+}}]} \qquad (2.9)$$

In pure water [H⁺] = [OH⁻] = 1.0 × 10⁻⁷ M (Equations 2.7 and 2.8). As mentioned
earlier, pure water is said to be “neutral” with respect to total ionic charge since
the concentrations of the positively charged hydrogen ions and the negatively charged
hydroxide ions are equal. Neutral solutions have a pH value of 7.0 (the negative value of
log 10⁻⁷ is 7.0). Acidic solutions have an excess of H⁺ due to the presence of dissolved
solute that supplies H⁺ ions. In a solution of 0.01 M HCl, for example, the concentration
of H⁺ is 0.01 M (10⁻² M) because HCl dissociates completely to H⁺ and Cl⁻. The
pH of such a solution is −log 10⁻² = 2.0. Thus, the higher the concentration of H⁺, the
lower the pH of the solution. The pH scale is logarithmic, so a change in pH of one unit
corresponds to a 10-fold change in the concentration of H⁺.

Aqueous solutions can also contain fewer H⁺ ions than pure water, resulting in a
pH above 7. In a solution of 0.01 M NaOH, for example, the concentration of OH⁻ is
0.01 M (10⁻² M) because NaOH, like HCl, is 100% dissociated in water. The H⁺ ions
derived from the ionization of water will combine with the hydroxide ions from NaOH
to re-form water molecules. This affects the equilibrium for the ionization of water
(Reaction 2.3). The resulting solution is very basic because of the low concentration of
protons. The actual pH can be determined from the ion product of water, Kw
(Equation 2.6), by substituting the concentration of hydroxide ions. Since the product of the
OH⁻ and H⁺ concentrations is 10⁻¹⁴ M², it follows that the H⁺ concentration in a solution
of 10⁻² M OH⁻ is 10⁻¹² M. The pH of the solution is 12. Table 2.3 shows this relationship
between pH and the concentrations of H⁺ and OH⁻.
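The two worked examples above (0.01 M HCl and 0.01 M NaOH) can be reproduced with a short sketch. It assumes complete dissociation, as the text does for these strong electrolytes, and the function names are hypothetical.

```python
import math

K_W = 1.0e-14  # ion product of water at 25 °C (M^2)


def ph_from_h(h_conc):
    """pH from the proton concentration in mol/l (Equation 2.9)."""
    return -math.log10(h_conc)


def ph_from_oh(oh_conc):
    """pH from the hydroxide concentration, using [H+] = K_w / [OH-]."""
    return ph_from_h(K_W / oh_conc)


print(f"0.01 M HCl : pH = {ph_from_h(0.01):.1f}")
print(f"0.01 M NaOH: pH = {ph_from_oh(0.01):.1f}")
print(f"pure water : pH = {ph_from_h(1.0e-7):.1f}")
```

The same two functions reproduce any row of a pH table, since each row is fixed by either concentration together with Kw.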

Basic solutions have pH values greater than 7.0 and acidic solutions have lower 
pH values. Figure 2.16 illustrates the pH values of various common solutions. 

Figure 2.16 ► 

pH values for various fluids at 25°C. Lower values correspond to acidic fluids; higher values corre- 
spond to basic fluids. 

Table 2.3 Relation of [H⁺] and [OH⁻] to pH

pH    [H⁺] (M)    [OH⁻] (M)
0     1           10⁻¹⁴
1     10⁻¹        10⁻¹³
2     10⁻²        10⁻¹²
3     10⁻³        10⁻¹¹
4     10⁻⁴        10⁻¹⁰
5     10⁻⁵        10⁻⁹
6     10⁻⁶        10⁻⁸
7     10⁻⁷        10⁻⁷
8     10⁻⁸        10⁻⁶
9     10⁻⁹        10⁻⁵
10    10⁻¹⁰       10⁻⁴
11    10⁻¹¹       10⁻³
12    10⁻¹²       10⁻²
13    10⁻¹³       10⁻¹
14    10⁻¹⁴       1

[Figure 2.16 labels, from basic to acidic: … hydroxide (1 M), ammonia (1 M), milk of magnesia, cow’s milk, coffee (black), lemon juice, … acid (1 M)]


▲ pH strips. The approximate pH of solutions 
can be determined in the lab by placing a 
drop on a pH strip. Various indicators are 
bound to a matrix that is affixed to a plastic 
strip. The indicators change color at different 
concentrations of H⁺, and the combination of
various colors gives a more or less accurate 
reading of the pH. The strips shown here cover 
all pH readings from 0 to 14 but other pH 
strips can be used to cover narrower ranges. 


pH is the negative logarithm of the proton 
(H⁺) concentration.


The term pH was first used in 1909 by Søren
Peter Lauritz Sørensen, director of the Carlsberg
Laboratories in Denmark. Sørensen never
mentioned what the little “p” stood for (the “H”
is obviously hydrogen). Many years later, some
of the scientists who write chemistry textbooks
began to associate the little “p” with the words
power or potential. This association, as it turns
out, is based on a rather tenuous connection in
some of Sørensen’s early papers. A recent investigation
of the historical records by Jens G.
Nørby suggests that the little “p” was an arbitrary
choice based on Sørensen’s use of p and q to
stand for unknown variables in much the same
way that we might use x and y today.

No matter what the historical origin, it’s 
important to remember that the symbol pH 
now stands for the negative logarithm of the 
hydrogen ion concentration. 

▲ Søren Peter Lauritz Sørensen
(1868-1939)

Accurate measurements of pH are routinely made using a pH meter, an instrument 
that incorporates a selectively permeable glass electrode that is sensitive to [H⁺].
Measurement of pH sometimes facilitates the diagnosis of disease. The normal pH of 
human blood is 7.4 — frequently referred to as physiological pH. The blood of pa- 
tients suffering from certain diseases, such as diabetes, can have a lower pH, a condi- 
tion called acidosis. The condition in which the pH of the blood is higher than 7.4, 
called alkalosis, can result from persistent, prolonged vomiting (loss of hydrochloric 
acid from the stomach) or from hyperventilation (excessive loss of carbonic acid as 
carbon dioxide). 


Weak acids and weak bases are 
compounds that only partially dissociate 
in water. 

2.9 Acid Dissociation Constants of Weak Acids 

Acids and bases that dissociate completely in water, such as hydrochloric acid and 
sodium hydroxide, are called strong acids and strong bases. Many other acids and bases, 
such as the amino acids from which proteins are made and the purines and pyrimidines 
from DNA and RNA, do not dissociate completely in water. These substances are 
known as weak acids and weak bases. 

In order to understand the relationship between acids and bases let us consider the
dissociation of HCl in water. Recall from Section 2.7 that we define an acid as a
molecule that can donate a proton and a base as a proton acceptor. Acids and bases always
come in pairs since for every proton donor there must be a proton acceptor. Both sides
of the dissociation reaction will contain an acid and a base. Thus, the equilibrium
reaction for the complete dissociation of HCl is

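$$\underset{\text{acid}}{\mathrm{HCl}} + \underset{\text{base}}{\mathrm{H_2O}} \rightleftharpoons \underset{\text{base}}{\mathrm{Cl^{-}}} + \underset{\text{acid}}{\mathrm{H_3O^{+}}} \qquad (2.10)$$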

HCl is an acid because it can donate a proton. In this case, the proton acceptor is
water, which is the base in this equilibrium reaction. On the other side of the
equilibrium are Cl⁻ and the hydronium ion, H₃O⁺. The chloride ion is the base that
corresponds to HCl after it has given up its proton. Cl⁻ is called the conjugate base of HCl,
which indicates that it is a base (i.e., can accept a proton) and is part of an acid-base
pair (i.e., HCl/Cl⁻). Similarly, H₃O⁺ is the acid on the right-hand side of the
equilibrium because it can donate a proton. H₃O⁺ is the conjugate acid of H₂O. Every base



has a corresponding conjugate acid and every acid has a corresponding conjugate
base. Thus, HCl is the conjugate acid of Cl⁻ and H₂O is the conjugate base of H₃O⁺.
Note that H₂O is the conjugate acid of OH⁻ if we are referring to the H₂O/OH⁻
acid-base pair.

In most cases throughout this book we will simplify reactions by ignoring the con- 
tribution of water and representing the hydronium ion as a simple proton. 

$$\mathrm{HCl} \rightleftharpoons \mathrm{H^{+}} + \mathrm{Cl^{-}} \qquad (2.11)$$

This is a standard convention in biochemistry but, on the surface, it seems to violate the 
rule that both sides of the equilibrium reaction should contain a proton donor and a 
proton acceptor. Students should keep in mind that in such reactions the contributions 
of water molecules as proton acceptors and hydronium ions as the true proton donors 
are implied. In almost all cases we can safely ignore the contribution of water. This is the 
same principle that we applied to the reaction for the dissociation of water (Section 2.7) 
which we simplified by ignoring the contribution of one of the water molecules. 

HCl is such a strong acid because the equilibrium shown in Reaction 2.11
is shifted so far to the right that HCl is completely dissociated in water. In
other words, HCl has a strong tendency to donate a proton when dissolved in water.
This also means that the conjugate base, Cl⁻, is a very weak base because it will rarely
accept a proton.

Acetic acid is the weak acid present in vinegar. The equilibrium reaction for the 
ionization of acetic acid is 


The contribution of water is implied in 
most acid/base dissociation reactions. 

$$\mathrm{CH_3COOH} \rightleftharpoons \mathrm{H^{+}} + \mathrm{CH_3COO^{-}} \qquad (2.12)$$

Acetic acid (weak acid); acetate anion (conjugate base)

We have left out the contribution of water molecules in order to simplify the reaction. 
We see that the acetate ion is the conjugate base of acetic acid. (We can also refer to acetic 
acid as the conjugate acid of the acetate ion.) 

The equilibrium constant for the dissociation of a proton from an acid in water is 
called the acid dissociation constant, Ka. When the reaction reaches equilibrium, which
happens very rapidly, the acid dissociation constant is equal to the concentration of the 
products divided by the concentration of the reactants. For Reaction 2.12 the acid dis- 
sociation constant is 

$$K_a = \frac{[\mathrm{H^{+}}][\mathrm{CH_3COO^{-}}]}{[\mathrm{CH_3COOH}]} \qquad (2.13)$$

The Ka value for acetic acid at 25°C is 1.76 × 10⁻⁵ M. Because Ka values are numerically
small and inconvenient in calculations it is useful to place them on a logarithmic
scale. The parameter pKa is defined by analogy with pH.

$$\mathrm{p}K_a = -\log K_a = \log\frac{1}{K_a} \qquad (2.14)$$

A pH value is a measure of the acidity of a solution and a pKa value is a measure of
the acid strength of a particular compound. The pKa of acetic acid is 4.8.

When dealing with bases we need to consider their protonated forms in order to
use Equation 2.13. These conjugate acids are very weak acids. In order to simplify
calculations and make easy comparisons we measure the equilibrium constant (Ka) for the
dissociation of a proton from the conjugate acid of a weak base. For example, the
ammonium ion (NH₄⁺) can dissociate to form the base ammonia (NH₃) and H⁺.

$$\mathrm{NH_4^{+}} \rightleftharpoons \mathrm{NH_3} + \mathrm{H^{+}} \qquad (2.15)$$


The acid dissociation constant (Ka) for this equilibrium is a measure of the strength of
the base (ammonia, NH₃) in aqueous solution. The Ka values for several common
substances are listed in Table 2.4.



Table 2.4 Dissociation constants and pKa values of weak acids in aqueous
solutions at 25°C

Acid                                      Ka (M)           pKa
HCOOH (Formic acid)                       1.77 × 10⁻⁴
CH₃COOH (Acetic acid)                     1.76 × 10⁻⁵      4.8
CH₃CHOHCOOH (Lactic acid)                 1.37 × 10⁻⁴      3.9
H₃PO₄ (Phosphoric acid)                   7.52 × 10⁻³      2.2
H₂PO₄⁻ (Dihydrogen phosphate ion)         6.23 × 10⁻⁸      7.2
HPO₄²⁻ (Monohydrogen phosphate ion)       2.20 × 10⁻¹³     12.7
H₂CO₃ (Carbonic acid)                     4.30 × 10⁻⁷      6.4
HCO₃⁻ (Bicarbonate ion)                   5.61 × 10⁻¹¹     10.2
NH₄⁺ (Ammonium ion)                       5.62 × 10⁻¹⁰
CH₃NH₃⁺ (Methylammonium ion)              2.70 × 10⁻¹¹
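The relationship between the Ka and pKa columns of Table 2.4 can be checked numerically. The following is a minimal sketch, assuming Python; the dictionary and function names are illustrative and not part of the text. The computed values may differ in the last digit from the rounded pKa values quoted in the chapter.

```python
import math

# Ka values in M, copied from Table 2.4 (aqueous solution, 25 degrees C).
KA_TABLE = {
    "Formic acid": 1.77e-4,
    "Acetic acid": 1.76e-5,
    "Lactic acid": 1.37e-4,
    "Phosphoric acid": 7.52e-3,
    "Dihydrogen phosphate ion": 6.23e-8,
    "Monohydrogen phosphate ion": 2.20e-13,
    "Carbonic acid": 4.30e-7,
    "Bicarbonate ion": 5.61e-11,
    "Ammonium ion": 5.62e-10,
    "Methylammonium ion": 2.70e-11,
}


def pka_from_ka(ka: float) -> float:
    """Equation 2.14: pKa = -log10(Ka)."""
    return -math.log10(ka)


# Print each acid with its Ka and the pKa computed from it.
for name, ka in KA_TABLE.items():
    print(f"{name:28s} Ka = {ka:.2e} M   pKa = {pka_from_ka(ka):.2f}")
```

A smaller Ka gives a larger pKa, so the weakest acids in the table (for example the monohydrogen phosphate ion) have the highest pKa values.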


From Equation 2.13 we see that the Ka for acetic acid is related to the concentration
of H⁺ and to the ratio of the concentrations of the acetate ion and undissociated
acetic acid. If we represent the conjugate acid as HA and the conjugate base as A⁻ then
taking the logarithm of such equations gives the general equation for any acid-base pair:

$$\mathrm{HA} \rightleftharpoons \mathrm{H^+} + \mathrm{A^-} \qquad \log K_a = \log \frac{[\mathrm{H^+}][\mathrm{A^-}]}{[\mathrm{HA}]} \qquad (2.16)$$

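Since log(xy) = log x + log y, Equation 2.16 can be rewritten as

$$\log K_a = \log[\mathrm{H^+}] + \log \frac{[\mathrm{A^-}]}{[\mathrm{HA}]} \qquad (2.17)$$

Rearranging Equation 2.17 gives

$$-\log[\mathrm{H^+}] = -\log K_a + \log \frac{[\mathrm{A^-}]}{[\mathrm{HA}]} \qquad (2.18)$$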

The pH of a solution of a weak acid or
base at equilibrium can be calculated by
combining the pKa of the ionization
reaction and the final concentrations of
the proton acceptor and proton donor.

The negative logarithms in Equation 2.18 have already been defined as pH and pKa
(Equations 2.9 and 2.14, respectively). Thus,

$$\mathrm{pH} = \mathrm{p}K_a + \log \frac{[\mathrm{A^-}]}{[\mathrm{HA}]} \qquad (2.19)$$

or

$$\mathrm{pH} = \mathrm{p}K_a + \log \frac{[\text{Proton acceptor}]}{[\text{Proton donor}]} \qquad (2.20)$$

Equation 2.20 is one version of the Henderson-Hasselbalch equation. It defines the
pH of a solution in terms of the pKa of the weak acid form of the acid-base pair and
the logarithm of the ratio of concentrations of the dissociated species (conjugate base)
to the protonated species (weak acid). Note that the greater the concentration of the
proton acceptor (conjugate base) relative to that of the proton donor (weak acid),
the lower the concentration of H⁺ and the higher the pH. (Remember that pH is the
negative log of H⁺ concentration. A high concentration of H⁺ means low pH.) This
makes intuitive sense since the concentration of A⁻ is identical to the concentration of
H⁺ in simple dissociation reactions. If more HA dissociates the concentration of A⁻
will be higher and so will the concentration of H⁺. When the concentrations of a weak
acid and its conjugate base are exactly the same the pH of the solution is equal to the
pKa of the acid (since the ratio of concentrations equals 1.0, and the logarithm of 1.0
equals zero).
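The Henderson-Hasselbalch equation translates directly into a one-line function. The sketch below assumes Python; the function name and the example concentrations are hypothetical choices for illustration, while the pKa of 4.8 for acetic acid comes from the text.

```python
import math


def henderson_hasselbalch(pka: float, proton_acceptor: float, proton_donor: float) -> float:
    """Equation 2.20: pH = pKa + log10([proton acceptor] / [proton donor]).

    Both concentrations must be in the same units and greater than zero.
    """
    if proton_acceptor <= 0 or proton_donor <= 0:
        raise ValueError("concentrations must be positive")
    return pka + math.log10(proton_acceptor / proton_donor)


# Equal concentrations of acetate and acetic acid: the log term is zero,
# so the pH equals the pKa.
print(henderson_hasselbalch(4.8, 0.05, 0.05))

# Ten times more acetate than acetic acid raises the pH by about one unit.
print(henderson_hasselbalch(4.8, 0.10, 0.01))
```

Only the ratio of the two concentrations matters, so any consistent unit (M, mM) gives the same pH.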

The Henderson-Hasselbalch equation is used to determine the final pH of a weak
acid solution once the dissociation reaction reaches equilibrium as illustrated in Sample
Calculation 2.1 for acetic acid. These calculations are more complicated than those
involving strong acids such as HCl. As noted in Section 2.8, the pH of an HCl solution is
easily determined from the amount of HCl that is present since the final concentration
of H⁺ is equal to the initial concentration of HCl when the solution is made up. In
contrast, weak acids are only partially dissociated in water so it makes sense that the pH
depends on the acid dissociation constant. The pH decreases (more H⁺) as more weak
acid is added to water but the increase in H⁺ is not linear with initial HA concentration.
This is because the numerator in Equation 2.16 is the product of the H⁺ and A⁻
concentrations.

The Henderson-Hasselbalch equation applies to other acid-base combinations as 
well and not just to those involving weak acids. When dealing with a weak base, for ex- 
ample, the numerator and denominator of Equation 2.20 become [weak base] and 
[conjugate acid], respectively. The important point to remember is that the equation 
refers to the concentration of the proton acceptor divided by the concentration of the 
proton donor. 

The pK a values of weak acids are determined by titration. Figure 2.17 shows the 
titration curve for acetic acid. In this example, a solution of acetic acid is titrated by 
adding small aliquots of a strong base of known concentration. The pH of the solution 
is measured and plotted versus the number of molar equivalents of strong base added 
during the titration. Note that since acetic acid has only one ionizable group (its car- 
boxyl group) only one equivalent of a strong base is needed to completely titrate acetic 
acid to its conjugate base, the acetate anion. When the acid has been titrated with one- 
half an equivalent of base the concentration of undissociated acetic acid exactly equals 
the concentration of the acetate anion. The resulting pH, 4.8, is thus the experimentally
determined pKa for acetic acid.

Constructing an ideal titration curve is a useful exercise for reinforcing the
relationship between pH and the ionization state of a weak acid. You can use the
Henderson-Hasselbalch equation to calculate the pH that results from adding increasing
amounts of a strong base such as NaOH to a weak acid such as the imidazolium ion (pKa = 7.0).
Adding base converts the imidazolium ion to its conjugate base, imidazole (Figure 2.18).
The shape of the titration curve is easy to visualize if you calculate the pH when the 
ratio of conjugate base to acid is 0.01, 0.1, 1, 10, and 100. Calculate pH values at other 
ratios until you are satisfied that the curve is relatively flat near the midpoint and 
steeper at the ends. 
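The exercise described above can be tabulated with a few lines of code. This is a minimal sketch, assuming Python and assuming that each equivalent of strong base converts one equivalent of imidazolium ion to imidazole; the function names are illustrative.

```python
import math

PKA_IMIDAZOLIUM = 7.0  # pKa of the imidazolium ion, as given in the text


def ph_at_ratio(pka: float, ratio: float) -> float:
    """Henderson-Hasselbalch: pH = pKa + log10([conjugate base] / [acid])."""
    return pka + math.log10(ratio)


def equivalents_titrated(ratio: float) -> float:
    """Fraction of the acid converted to conjugate base at a given base/acid ratio."""
    return ratio / (1.0 + ratio)


# One row per ratio suggested in the text: (ratio, equivalents of base, pH).
rows = [
    (r, equivalents_titrated(r), ph_at_ratio(PKA_IMIDAZOLIUM, r))
    for r in (0.01, 0.1, 1, 10, 100)
]

for ratio, equivalents, ph in rows:
    print(f"ratio {ratio:>6}: {equivalents:.3f} equivalents of base, pH {ph:.1f}")
```

Going from a ratio of 0.1 to 10 covers most of the titration (roughly 0.09 to 0.91 equivalents) but changes the pH by only two units, which is why the curve is flat near the midpoint.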

Similarly shaped titration curves can be obtained for each of the five monoprotic 
acids (acids having only one ionizable group) listed in Table 2.4. All would exhibit the 
same general shape as Figure 2.17 but the inflection point representing the midpoint of 
titration (one-half an equivalent titrated) would fall lower on the pH scale for a 
stronger acid (such as formic acid or lactic acid) and higher for a weaker acid (such as 
ammonium ion or methylammonium ion). 

Titration curves of weak acids illustrate a second important use of the
Henderson-Hasselbalch equation. In this case, the final pH is the result of mixing the weak acid
(HA) and a strong base (OH⁻). The base combines with H⁺ ions to form water
molecules, H₂O. This reduces the concentration of H⁺ and raises the pH. As the titration of
the weak acid proceeds it dissociates in order to restore its equilibrium with OH⁻ and
H₂O. The net result is that the final concentration of A⁻ is much higher, and the
concentration of HA is much lower, than when we are dealing with the simple case where
the pH is determined only by the dissociation of the weak acid in water (i.e., a solution
of HA in H₂O).

▲ Figure 2.17

Titration of acetic acid (CH₃COOH) with aqueous base (OH⁻). There is an inflection point
(a point of minimum slope) at the midpoint
of the titration, when 0.5 equivalent of base
has been added to the solution of acetic
acid. This is the point at which
[CH₃COOH] = [CH₃COO⁻] and pH = pKa.
The pKa of acetic acid is thus 4.8. At the
endpoint, all the molecules of acetic acid
have been titrated to the conjugate base, acetate.

▲ Figure 2.18

Titration of the imidazolium ion. [Structures: imidazolium ion ⇌ imidazole + H⁺, pKa = 7.0]


Figure 2.19 ►

Titration curve for H₃PO₄. Three inflection
points (at 0.5, 1.5, and 2.5 equivalents of
strong base added) correspond to the three
pKa values for phosphoric acid (2.2, 7.2,
and 12.7).

▲ Cola beverages contain phosphoric acid 
in order to make the drink more acidic. The 
concentration of phosphoric acid is about 
1 mM. This concentration should make the 
pH about 3 in the absence of any other 
ingredients that may contribute to acidity. 

[Figure 2.19 label: third midpoint, [HPO₄²⁻] = [PO₄³⁻]]

Phosphoric acid (H₃PO₄) is a polyprotic acid. It contains three different hydrogen
atoms that can dissociate to form H⁺ ions and corresponding conjugate bases with one,
two, or three negative charges. The dissociation of the first proton occurs readily and is
associated with a large acid dissociation constant of 7.53 × 10⁻³ M and a pKa of 2.2 in
aqueous solution. The dissociations of the second and third protons occur progressively
less readily because they have to dissociate from a molecule that is already negatively
charged.

Phosphoric acid requires three equivalents of strong base for complete titration
and three pKa values are evident from its titration curve (Figure 2.19). The three pKa
values reflect the three equilibrium constants and thus the existence of four possible
ionic species (conjugate acids and bases) of inorganic phosphate. At physiological pH
(7.4) the predominant species of inorganic phosphate are H₂PO₄⁻ and HPO₄²⁻. At
pH 7.2 these two species exist in equal concentrations. The concentrations of H₃PO₄
and PO₄³⁻ are so low at pH 7.4 that they can be ignored. This is generally the case for a
minor species when the pH is more than two units away from its pKa.
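The claim that only two phosphate species matter at physiological pH can be checked by computing the fraction of each species from the three pKa values. The following is a sketch, assuming Python and the pKa values 2.2, 7.2, and 12.7 quoted for Figure 2.19; the function name is illustrative, and the formula is the standard expression obtained by applying the three dissociation equilibria in sequence.

```python
def phosphate_fractions(ph: float, pkas=(2.2, 7.2, 12.7)) -> list:
    """Return the fractions of H3PO4, H2PO4-, HPO4(2-), and PO4(3-) at a given pH.

    Each term is proportional to the concentration of one species:
    [H+]^3, K1*[H+]^2, K1*K2*[H+], K1*K2*K3.
    """
    h = 10.0 ** (-ph)
    k1, k2, k3 = (10.0 ** (-pka) for pka in pkas)
    terms = [h ** 3, k1 * h ** 2, k1 * k2 * h, k1 * k2 * k3]
    total = sum(terms)
    return [term / total for term in terms]


names = ["H3PO4", "H2PO4-", "HPO4(2-)", "PO4(3-)"]
for ph in (7.2, 7.4):
    fractions = phosphate_fractions(ph)
    summary = ", ".join(f"{n}: {f:.2e}" for n, f in zip(names, fractions))
    print(f"pH {ph}: {summary}")
```

At pH 7.2 the two middle species come out equal, and at pH 7.4 the fully protonated and fully deprotonated forms are each only a few parts per million of the total.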



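$$\begin{aligned}
\mathrm{H_3PO_4} &\rightleftharpoons \mathrm{H_2PO_4^-} + \mathrm{H^+} & (\mathrm{p}K_1 = 2.2) \\
\mathrm{H_2PO_4^-} &\rightleftharpoons \mathrm{HPO_4^{2-}} + \mathrm{H^+} & (\mathrm{p}K_2 = 7.2) \\
\mathrm{HPO_4^{2-}} &\rightleftharpoons \mathrm{PO_4^{3-}} + \mathrm{H^+} & (\mathrm{p}K_3 = 12.7)
\end{aligned} \qquad (2.21)$$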

Many biologically important acids and bases, including the amino acids described 
in Chapter 3, have two or more ionizable groups. The number of p K a values for such 
substances is equal to the number of ionizable groups. The p K a values can be experi- 
mentally determined by titration. 



Sample Calculation 2.1 CALCULATING THE pH OF WEAK ACID 


Q: What is the pH of a solution of 0.1 M acetic acid? 

A: The acid dissociation constant of acetic acid is 1.76 × 10⁻⁵ M. Acetic acid
dissociates in water to form acetate and H⁺. We need to determine [H⁺] when the reaction
reaches equilibrium.

Let the final H⁺ concentration be represented by the unknown quantity x. At
equilibrium the concentration of acetate ion will also be x and the final concentration of
acetic acid will be [0.1 M − x]. Thus,

$$1.76 \times 10^{-5} = \frac{[\mathrm{H^+}][\mathrm{CH_3COO^-}]}{[\mathrm{CH_3COOH}]} = \frac{x^2}{(0.1 - x)}$$

rearranging gives

$$1.76 \times 10^{-6} - 1.76 \times 10^{-5}\,x = x^2$$
$$x^2 + 1.76 \times 10^{-5}\,x - 1.76 \times 10^{-6} = 0$$

This equation is a typical quadratic equation of the form ax² + bx + c = 0, where
a = 1, b = 1.76 × 10⁻⁵, and c = −1.76 × 10⁻⁶. Solve for x using the standard formula

$$x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a} = \frac{-1.76 \times 10^{-5} \pm \sqrt{(1.76 \times 10^{-5})^2 - 4(1)(-1.76 \times 10^{-6})}}{2}$$

x = 0.00132 or −0.00134 (reject the negative answer)

The hydrogen ion concentration is 0.00132 M and the pH is

$$\mathrm{pH} = -\log[\mathrm{H^+}] = -\log(0.00132) = -(-2.88) = 2.9$$

Note that the contribution of hydrogen ions from the dissociation of water (10⁻⁷ M) is
several orders of magnitude lower than the concentration of hydrogen ions from
acetic acid. It is standard practice to ignore the ionization of water in most
calculations as long as the initial concentration of weak acid is greater than 0.001 M.

The amount of acetic acid that dissociates to form H⁺ and CH₃COO⁻ is 0.0013 M
when the initial concentration is 0.1 M. This means that only 1.3% of the acetic acid
molecules dissociate and the final concentration of acetic acid ([CH₃COOH]) is
98.7% of the initial concentration. In general, the percent dissociation of dilute
solutions of weak acids is less than 10% and it is a reasonable approximation to 
assume that the final concentration of the acid form is the same as its initial concen- 
tration. This approximation has very little effect on the calculated pH and it has the 
advantage of avoiding quadratic equations. 

Assuming that the concentration of CH₃COOH at equilibrium is 0.1 M and the
concentration of H⁺ is x,

$$K_a = 1.76 \times 10^{-5} = \frac{x^2}{0.1} \qquad x = 1.33 \times 10^{-3}$$
$$\mathrm{pH} = -\log(1.33 \times 10^{-3}) = 2.88 = 2.9$$
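Both routes in Sample Calculation 2.1 can be compared side by side. The sketch below assumes Python; the function names are illustrative. It solves the quadratic exactly and then applies the approximation that the equilibrium concentration of the acid equals its initial concentration.

```python
import math

KA_ACETIC = 1.76e-5   # M, from the sample calculation
INITIAL_ACID = 0.1    # M, initial acetic acid concentration


def h_exact(ka: float, c0: float) -> float:
    """Positive root of x^2 + ka*x - ka*c0 = 0, where x = [H+] at equilibrium."""
    return (-ka + math.sqrt(ka * ka + 4.0 * ka * c0)) / 2.0


def h_approx(ka: float, c0: float) -> float:
    """Approximation assuming [HA] at equilibrium is still c0, so x^2 = ka*c0."""
    return math.sqrt(ka * c0)


x_exact = h_exact(KA_ACETIC, INITIAL_ACID)
x_approx = h_approx(KA_ACETIC, INITIAL_ACID)

print(f"exact:       [H+] = {x_exact:.5f} M, pH = {-math.log10(x_exact):.2f}")
print(f"approximate: [H+] = {x_approx:.5f} M, pH = {-math.log10(x_approx):.2f}")
print(f"percent dissociated (exact): {100.0 * x_exact / INITIAL_ACID:.1f}%")
```

The two hydrogen ion concentrations differ by less than 1%, and both round to a pH of 2.9, which is why the approximation is acceptable when the percent dissociation is small.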

▲ Tris buffers. Tris, or tris(hydroxymethyl)aminomethane,
is a common buffer in
biochemistry labs. Its pKa of 8.06 makes
it ideal for preparation of buffers in the
physiological range. [Structure: a central carbon bearing three —CH₂OH groups and an amino group]


2.10 Buffered Solutions Resist Changes in pH 

▲ Figure 2.20

Buffer range of acetic acid. For CH₃COOH +
CH₃COO⁻ the pKa is 4.8 and the most
effective buffer range is from pH 3.8 to pH 5.8.

If the pH of a solution remains nearly constant when small amounts of strong acid or 
strong base are added the solution is said to be buffered. The ability of a solution to resist 
changes in pH is known as its buffer capacity. Inspection of the titration curves of acetic 
acid (Figure 2.17) and phosphoric acid (Figure 2.19) reveals that the most effective 
buffering, indicated by the region of minimum slope on the curve, occurs when the 
concentrations of a weak acid and its conjugate base are equal — in other words, when 
the pH equals the p K a . The effective range of buffering by a mixture of a weak acid and 
its conjugate base is usually considered to be from one pH unit below to one pH unit 
above the p K a . 

Most in vitro biochemical experiments involving purified molecules, cell extracts, 
or intact cells are performed in the presence of a suitable buffer to ensure a stable pH. A 
number of synthetic compounds with a variety of p K a values are often used to prepare 
buffered solutions but naturally occurring compounds can also be used as buffers. For 
example, mixtures of acetic acid and sodium acetate (pKa = 4.8) can be used for the pH
range from 4 to 6 (Figure 2.20) and mixtures of KH₂PO₄ and K₂HPO₄ (pKa = 7.2) can be
used in the range from 6 to 8. The amino acid glycine (pKa = 9.8) is often used in the
range from 9 to 11.

When preparing buffers the acid solution (e.g., acetic acid) supplies the protons 
and some of the protons are taken up by combining with the conjugate base (e.g., ac- 
etate). The conjugate base is added as a solution of a salt (e.g., sodium acetate). The salt 
dissociates completely in solution providing free conjugate base and no protons. 
Sample Calculation 2.2 illustrates one way to prepare a buffer solution. 

Sample Calculation 2.2 BUFFER PREPARATION 

Q: Acetic acid has a p K a of 4.8. How many milliliters of 0.1 M acetic acid and 0.1 M 
sodium acetate are required to prepare 1 liter of 0.1 M buffer solution having a pH 
of 5.8? 

A: Substitute the values for the pKa and the desired pH into the
Henderson-Hasselbalch equation (Equation 2.20).

$$5.8 = 4.8 + \log \frac{[\text{Acetate}]}{[\text{Acetic acid}]}$$

Solve for the ratio of acetate to acetic acid.

$$\log \frac{[\text{Acetate}]}{[\text{Acetic acid}]} = 5.8 - 4.8 = 1.0$$
$$[\text{Acetate}] = 10\,[\text{Acetic acid}]$$

For each volume of acetic acid, 10 volumes of acetate must be added (making a total
of 11 volumes of the two ionic species). Multiply the proportion of each component
by the desired volume.

Acetic acid needed: 1/11 × 1000 ml = 91 ml

Acetate needed: 10/11 × 1000 ml = 909 ml


Note that when the ratio of [conjugate base] to [conjugate acid] is 10:1, the pH is
exactly one unit above the pKa. If the ratio were 1:10, the pH would be one unit below
the pKa.
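The buffer recipe in Sample Calculation 2.2 generalizes to any target pH. This is a minimal sketch, assuming Python and assuming that the acid and conjugate base stock solutions have the same molarity, so that the volume ratio equals the concentration ratio; the function name is illustrative.

```python
def buffer_volumes(pka: float, target_ph: float, total_ml: float):
    """Return (acid_ml, base_ml) of equimolar stock solutions for a buffer.

    The Henderson-Hasselbalch equation gives the ratio [base]/[acid] as
    10**(pH - pKa); the total volume is split in that proportion.
    """
    ratio = 10.0 ** (target_ph - pka)      # volumes of base per volume of acid
    acid_ml = total_ml / (1.0 + ratio)
    base_ml = total_ml - acid_ml
    return acid_ml, base_ml


acid_ml, base_ml = buffer_volumes(pka=4.8, target_ph=5.8, total_ml=1000.0)
print(f"0.1 M acetic acid:   {acid_ml:.0f} ml")
print(f"0.1 M sodium acetate: {base_ml:.0f} ml")
```

Because both stocks are 0.1 M, the combined concentration of acetic acid plus acetate in the final liter is still 0.1 M regardless of the ratio chosen.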


► Figure 2.21

Percentages of carbonic acid and its conjugate
base as a function of pH. In an aqueous
solution at pH 7.4 (the pH of blood) the
concentrations of carbonic acid (H₂CO₃) and
bicarbonate (HCO₃⁻) are substantial, but
the concentration of carbonate (CO₃²⁻) is negligible.

An excellent example of buffer capacity is found in the blood plasma of mammals, 
which has a remarkably constant pH. Consider the results of an experiment that compares 
the addition of an aliquot of strong acid to a volume of blood plasma with a similar
addition of strong acid to either physiological saline (0.15 M NaCl) or water. When 1 milliliter
of 10 M HCl (hydrochloric acid) is added to 1 liter of physiological saline or water that
is initially at pH 7.0 the pH is lowered to 2.0 (in other words, [H⁺] from HCl is diluted
to 10⁻² M). However, when 1 milliliter of 10 M HCl is added to 1 liter of human blood
plasma at pH 7.4 the pH is lowered to only 7.2, impressive evidence for the
effectiveness of physiological buffering.
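The unbuffered half of this comparison is a simple dilution and can be verified directly. The sketch below assumes Python, complete dissociation of HCl, and no buffering by the solution; it ignores the 10⁻⁷ M H⁺ already present in water. The function name is illustrative.

```python
import math


def ph_after_strong_acid(acid_molarity: float, acid_ml: float, solution_ml: float) -> float:
    """pH after adding a strong acid to an unbuffered solution (water or saline)."""
    moles_h = acid_molarity * acid_ml / 1000.0          # moles of H+ added
    total_liters = (acid_ml + solution_ml) / 1000.0     # final volume
    return -math.log10(moles_h / total_liters)


ph_unbuffered = ph_after_strong_acid(acid_molarity=10.0, acid_ml=1.0, solution_ml=1000.0)
print(f"pH of water or saline after adding HCl: {ph_unbuffered:.2f}")
```

The same addition changes the pH of plasma by only 0.2 units according to the text, a difference of nearly five pH units relative to the unbuffered case.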

The pH of blood is primarily regulated by the carbon dioxide-carbonic acid-bicarbonate
buffer system. A plot of the percentages of carbonic acid (H₂CO₃) and its conjugate
base as a function of pH is shown in Figure 2.21. Note that the major components
at pH 7.4 are carbonic acid and the bicarbonate anion (HCO₃⁻).

The buffer capacity of blood depends on equilibria between gaseous carbon diox- 
ide (which is present in the air spaces of the lungs), aqueous carbon dioxide (which is 
produced by respiring tissues and dissolved in blood), carbonic acid, and bicarbonate. 
As shown in Figure 2.21, the equilibrium between bicarbonate and its conjugate base,
carbonate (CO₃²⁻), does not contribute significantly to the buffer capacity of blood
because the pKa of bicarbonate is 10.2, too far from physiological pH to have an effect on
the buffering of blood.

The first of the three relevant equilibria of the carbon dioxide-carbonic acid-bicarbonate
buffer system is the dissociation of carbonic acid to bicarbonate.

$$\mathrm{H_2CO_3} \rightleftharpoons \mathrm{H^+} + \mathrm{HCO_3^-} \qquad (2.22)$$

This equilibrium is affected by a second equilibrium in which dissolved carbon dioxide
is in equilibrium with its hydrated form, carbonic acid.

$$\mathrm{CO_2\,(aqueous)} + \mathrm{H_2O} \rightleftharpoons \mathrm{H_2CO_3} \qquad (2.23)$$

These two reactions can be combined into a single equilibrium reaction where the acid
is represented as CO₂ dissolved in water:

$$\mathrm{CO_2\,(aqueous)} + \mathrm{H_2O} \rightleftharpoons \mathrm{H^+} + \mathrm{HCO_3^-} \qquad (2.24)$$

[Figure 2.22 diagram labels: aqueous phase of blood in the lung (HCO₃⁻, H₂CO₃, H₂O, CO₂); air space in lung (CO₂)]

The pKa of the acid is 6.4.

Finally, CO₂ (gaseous) is in equilibrium with CO₂ (aqueous).

$$\mathrm{CO_2\,(gaseous)} \rightleftharpoons \mathrm{CO_2\,(aqueous)} \qquad (2.25)$$
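With a pKa of 6.4 for the combined equilibrium, the Henderson-Hasselbalch equation can be inverted to give the ratio of bicarbonate to dissolved CO₂ at the pH of blood. This is a sketch, assuming Python; the function names are illustrative, and the calculation treats blood as a simple solution of this one acid-base pair.

```python
def base_to_acid_ratio(ph: float, pka: float) -> float:
    """Rearranged Henderson-Hasselbalch: [base]/[acid] = 10**(pH - pKa)."""
    return 10.0 ** (ph - pka)


def percent_as_base(ph: float, pka: float) -> float:
    """Percentage of the acid-base pair present as the conjugate base."""
    ratio = base_to_acid_ratio(ph, pka)
    return 100.0 * ratio / (1.0 + ratio)


PKA_CO2 = 6.4    # pKa given in the text for the combined equilibrium
BLOOD_PH = 7.4

print(f"[HCO3-]/[CO2(aq)] at pH {BLOOD_PH}: {base_to_acid_ratio(BLOOD_PH, PKA_CO2):.1f}")
print(f"percent present as bicarbonate: {percent_as_base(BLOOD_PH, PKA_CO2):.1f}%")
```

A ratio near 10:1 puts blood at the upper edge of the usual one-unit buffering range for this pair, which is consistent with the text's emphasis on the lungs adjusting CO₂ to hold the pH steady.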

The regulation of the pH of blood afforded by these three equilibria is shown
schematically in Figure 2.22. When the pH of blood falls due to a metabolic process that
produces excess H⁺ the concentration of H₂CO₃ increases momentarily but H₂CO₃
rapidly loses water to form dissolved CO₂ (aqueous) which enters the gaseous phase in
the lungs and is expired as CO₂ (gaseous). An increase in the partial pressure of CO₂
(pCO₂) in the air expired from the lungs thus compensates for the increased hydrogen
ions. Conversely, when the pH of the blood rises the concentration of HCO₃⁻ increases
transiently but the pH is rapidly restored as the breathing rate changes and the CO₂
(gaseous) in the lungs is converted to CO₂ (aqueous) and then to H₂CO₃ in the
capillaries of the lungs. Again, the equilibrium of the blood buffer system is rapidly restored by
changing the partial pressure of CO₂ in the lungs.

▲ Figure 2.22

Regulation of the pH of blood in mammals. The
pH of blood is controlled by the ratio of
[HCO₃⁻] to pCO₂ in the air spaces of the
lungs. When the pH of blood decreases due
to excess H⁺, pCO₂ increases in the lungs,
restoring the equilibrium. When the
concentration of HCO₃⁻ rises because the pH of
blood increases, CO₂ (gaseous) dissolves in
the blood, again restoring the equilibrium.

Within cells, both proteins and inorganic phosphate contribute to intracellular
buffering. Hemoglobin is the strongest buffer in blood cells other than the carbon
dioxide-carbonic acid-bicarbonate buffer. As mentioned earlier, the major species of
inorganic phosphate present at physiological pH are H₂PO₄⁻ and HPO₄²⁻, reflecting the
second pKa (pK₂) value for phosphoric acid, 7.2.


1. The water molecule has a permanent dipole because of the un- 
even distribution of charge in O — H bonds and their angled 

2. Water molecules can form hydrogen bonds with each other. Hy- 
drogen bonding contributes to the high specific heat and heat of 
vaporization of water. 

3. Because it is polar, water can dissolve ions. Water molecules form 
a solvation sphere around each dissolved ion. Organic molecules 
may be soluble in water if they contain ionic or polar functional 
groups that can form hydrogen bonds with water molecules. 

4. The hydrophobic effect is the exclusion of nonpolar substances by 
water molecules. Detergents, which contain both hydrophobic and 
hydrophilic portions, form micelles when suspended in water; these 
micelles can trap insoluble substances in a hydrophobic interior. 
Chaotropes enhance the solubility of nonpolar compounds in water. 

5. The major noncovalent interactions that determine the structure and 
function of biomolecules are electrostatic interactions and hydropho- 
bic interactions. Electrostatic interactions include charge-charge 
interactions, hydrogen bonds, and van der Waals forces. 

6. Under cellular conditions, macromolecules do not spontaneously
hydrolyze, despite the presence of high concentrations of water.
Specific enzymes catalyze their hydrolysis, and other enzymes
catalyze their energy-requiring biosynthesis.

7. At 25°C, the product of the proton concentration ([H⁺]) and the
hydroxide concentration ([OH⁻]) is 1.0 × 10⁻¹⁴ M², a constant
designated Kw (the ion-product constant for water). Pure water
ionizes to produce 10⁻⁷ M H⁺ and 10⁻⁷ M OH⁻.

8. The acidity or basicity of an aqueous solution depends on the
concentration of H⁺ and is described by a pH value, where pH is
the negative logarithm of the hydrogen ion concentration.

9. The strength of a weak acid is indicated by its pK a value. The 
Henderson-Hasselbalch equation defines the pH of a solution of 
weak acid in terms of the p K a and the concentrations of the weak 
acid and its conjugate base. 

10. Buffered solutions resist changes in pH. In human blood, a con- 
stant pH of 7.4 is maintained by the carbon dioxide-carbonic 
acid-bicarbonate buffer system. 


1. The side chains of some amino acids possess functional groups 
that readily form hydrogen bonds in aqueous solution. Draw the 
hydrogen bonds likely to form between water and the following 
amino acid side chains: 




—CH₂OH

—CH₂C(O)NH₂

[Structure: —CH₂— attached to an imidazole ring containing N= and N—H]

2. State whether each of the following compounds is polar, whether 
it is amphipathic, and whether it readily dissolves in water. 

(a) HO — CH 2 — CH — CH 2 — OH 





(b) CH₃(CH₂)₁₄—CH₂—OPO₃²⁻

Hexadecanyl phosphate

(c) CH₃—(CH₂)₁₀—COO⁻

(d) ⁺H₃N—CH₂—COO⁻


3. Osmotic lysis is a gentle method of breaking open animal cells to 
free intracellular proteins. In this technique, cells are suspended 
in a solution that has a total molar concentration of solutes much 
less than that found naturally inside cells. Explain why this tech- 
nique might cause cells to burst. 

4. Each of the following molecules is dissolved in buffered solutions
of: (a) pH = 2 and (b) pH = 11. For each molecule, indicate the
solution in which the charged species will predominate. (Assume
that the added molecules do not appreciably change the pH of the
solution.)

(a) Phenyl lactic acid pKa = 4

(b) Imidazole pKa = 7

(c) O-methyl-γ-aminobutyrate pKa = 9.5

CH₃OC(O)CH₂CH₂CH₂—NH₃⁺

(d) Phenyl salicylate pKa = 9.6

5. Use Figure 2.16 to determine the concentration of H⁺ and OH⁻ in:

(a) tomato juice 

(b) human blood plasma 

(c) 1 M ammonia 

6. The interaction between two (or more) molecules in solution can 
be mediated by specific hydrogen bond interactions. Phorbol es- 
ters can act as a tumor promoter by binding to certain amino 
acids that are part of the enzyme protein kinase C (PKC). Draw 
the hydrogen bonds expected in the complex formed between the 
tumor promoter phorbol and the glycine portion of PKC: 
—NHCH₂C(O)—

7. What is the concentration of a lactic acid buffer (pKa = 3.9) that
contains 0.25 M CH₃CH(OH)COOH and 0.15 M CH₃CH(OH)COO⁻?
What is the pH of this buffer?

8. You are instructed to prepare 100 ml of a 0.02 M sodium phosphate
buffer, pH 7.2, by mixing 50 ml of solution A (0.02 M
Na₂HPO₄) and 50 ml of solution B (0.02 M NaH₂PO₄). Refer to
Table 2.4 to explain why this procedure provides an effective
buffer at the desired pH and concentration.

9. What are the effective buffering ranges of MOPS (3-(N-morpholino)propanesulfonic
acid) and SHS (sodium hydrogen succinate)?

HOOC—CH₂—CH₂—COO⁻ Na⁺

The nitrogen atom of MOPS can be protonated (pKa = 7.2). The
carboxyl group of SHS can be ionized (pKa = 5.5). Calculate the
ratio of basic to acidic species for each buffer at pH 6.5.

10. Many phosphorylated sugars (phosphate esters of sugars) are
metabolic intermediates. The two ionizable —OH groups of the
phosphate group of the monophosphate ester of ribose (ribose
5-phosphate) have pKa values of 1.2 and 6.6. The fully protonated
form of α-D-ribose 5-phosphate has the structure shown below.

(a) Draw, in order, the ionic species formed upon titration of
this phosphorylated sugar from pH 0.0 to pH 10.0.

(b) Sketch the titration curve for ribose 5-phosphate.

11. Normally, gaseous CO₂ is efficiently expired in the lungs. Under
certain conditions, such as obstructive lung disease or emphysema,
expiration is impaired. The resulting excess of CO₂ in the
body may lead to respiratory acidosis, a condition in which excess
acid accumulates in bodily fluids. How does excess CO₂ lead to
respiratory acidosis?

12. Organic compounds in the diets of animals are a source of basic
ions and may help combat nonrespiratory types of acidosis. Many
fruits and vegetables contain salts of organic acids that can be
metabolized, as shown below for sodium lactate. Explain how the
salts of dietary acids may help alleviate metabolic acidosis.

$$\mathrm{CH_3{-}CH(OH){-}COO^-\,Na^+} + 3\,\mathrm{O_2} \longrightarrow \mathrm{Na^+} + 2\,\mathrm{CO_2} + \mathrm{HCO_3^-} + 2\,\mathrm{H_2O}$$

13. Absorption of food in the stomach and intestine depends on the 
ability of molecules to penetrate the cell membranes and pass into 
the bloodstream. Because hydrophobic molecules are more likely 
to be absorbed than hydrophilic or charged molecules, the ab- 
sorption of orally administered drugs may depend on their pK a 
values and the pH in the digestive organs. Aspirin (acetylsalicylic 
acid) has an ionizable carboxyl group (pK a = 3.5). Calculate the 
percentage of the protonated form of aspirin available for absorp- 
tion in the stomach (pH = 2.0) and in the intestine (pH = 5.0). 




14. What percent of glycinamide, ⁺H₃NCH₂CONH₂ (pKa = 8.20), is
unprotonated at (a) pH 7.5, (b) pH 8.2, and (c) pH 9.0?


15. Refer to the following table and titration curve to determine which
compound from the table is illustrated by the titration curve.

[Table and figure: pK values (up to pK₃) for phosphoric acid, acetic acid, succinic acid, and boric acid; titration curve]

16 . Predict which of the following substances are soluble in water. 

CH 2 OH 

about 4.0 × 10⁻¹³. What is the actual neutral pH for extremophiles
living at 0°C and 100°C? 

18. What is the approximate pH of a solution of 6 M HCl? Why doesn’t
the scale in Figure 2.16 accommodate the pH of such a solution?


Amino Acids and the Primary 
Structures of Proteins 

The relationship between structure and function is a fundamental part of biochemistry.
In spite of its importance, we sometimes forget to mention structure-function
relationships, thinking that the concept is obvious from the examples. In this
book we will try to remind you from time to time how the study of structure leads to a
better understanding of function. This is especially important when studying proteins.

In this chapter and the next one we will cover the basic rules of protein structure. In 
Chapters 5 and 6, we will learn how enzymes work and how their structure contributes 
to the mechanisms of enzyme action. 

Before beginning, let’s review the various kinds of proteins. The following list, al- 
though not exhaustive, covers most of the important biological functions of proteins: 

1. Many proteins function as enzymes, the biochemical catalysts. Enzymes catalyze 
nearly all reactions that occur in living organisms. 

2. Some proteins bind other molecules for storage and transport. For example,
hemoglobin binds and transports O₂ and CO₂ in red blood cells and other proteins bind
fatty acids and lipids.

3. Several types of proteins serve as pores and channels in membranes, allowing for 
the passage of small, charged molecules. 

4. Some proteins, such as tubulin, actin, and collagen, provide support and shape to 
cells and hence to tissues and organisms. 

5. Assemblies of proteins can do mechanical work, such as the movement of flagella, 
the separation of chromosomes at mitosis, and the contraction of muscles. 

6. Many proteins play a role in information flow in the cell. Some are involved in 
translation whereas others play a role in regulating gene expression by binding to 
nucleic acids. 

7. Some proteins are hormones, which regulate biochemical activities in target cells or
tissues; other proteins serve as receptors for hormones.

8. Proteins on the cell surface can act as receptors for various ligands and as modifiers
of cell-cell interactions.

9. Some proteins have highly specialized functions. For example, antibodies defend
vertebrates against bacterial and viral infections, and toxins, produced by bacteria,
can kill larger organisms.

"Amino acids are literally raining
down from the sky and if that's
not a big deal then I don't know
what is."

Max Bernstein,
SETI Institute

The functions of biochemical molecules
can only be understood by knowing their
structures.

Top: L-Arginine, one of the 20 common amino acids.

There are many different kinds of proteins
with many different roles in metabolism
and cell structure.

We begin our study of proteins by exploring the structures and chemical properties 
of their constituent amino acids. In this chapter we will also discuss the purification, 
analysis, and sequencing of polypeptides. 

▲ Spindle fibers. Spindle fibers (green) help
separate chromosomes at mitosis. The fibers
are microtubules formed from the structural
protein tubulin.

▲ Numbering conventions for amino acids. In
traditional names, the carbon atoms adjacent
to the carboxyl group are identified by the
Greek letters α, β, γ, etc. In the official
IUPAC/IUBMB chemical names or systematic
names, the carbon atom in the carboxyl group
is number 1 and the adjacent carbons are
numbered sequentially. Thus, the α-carbon
atom in traditional names is the carbon 2
atom in systematic names.
[Structures: the ⁺H₃N—CH—COO⁻ backbone shown twice, with Greek-letter and numeric carbon labels]

The IUPAC-IUBMB website for 
Nomenclature and Symbolism for 
Amino Acids and Peptides is: www. 

3.1 General Structure of Amino Acids 

All organisms use the same 20 amino acids as building blocks for the assembly of protein 
molecules. These 20 amino acids are called the common, or standard, amino acids.
Despite the limited number of amino acids, an enormous variety of different polypeptides
can be produced by connecting the 20 common amino acids in various combinations. 

Amino acids are called amino acids because they are amino derivatives of
carboxylic acids. In the 20 common amino acids the amino group and the carboxyl group
are bonded to the same carbon atom: the α-carbon atom. Thus, all of the standard
amino acids found in proteins are α-amino acids. Two other substituents are bound to
the α-carbon: a hydrogen atom and a side chain (R) that is distinctive for each amino
acid. In the chemical names of amino acids, carbon atoms are identified by numbers,
beginning with the carbon atom of the carboxyl group. [The correct chemical name, or
systematic name, follows rules established by the International Union of Pure and
Applied Chemistry (IUPAC) and the International Union of Biochemistry and Molecular
Biology (IUBMB).] If the R group is —CH₃ then the systematic name for that amino
acid would be 2-aminopropanoic acid. (Propanoic acid is CH₃—CH₂—COOH.) The
trivial name for CH₃—CH(NH₂)—COOH is alanine. The old nomenclature uses Greek
letters to identify the α-carbon atom and the carbon atoms of the side chain. This
nomenclature identifies the carbon atom relative to the carboxyl group so the carbon
atom of the carboxyl group is not specified, unlike in the systematic nomenclature,
where this carbon atom is number 1 in the numbering system. Biochemists have
traditionally used the old, alternate nomenclature.

Inside a cell, under normal physiological conditions, the amino group is protonated
(-NH₃⁺) because the pKa of this group is close to 9. The carboxyl group is ionized
(-COO⁻) because the pKa of that group is below 3, as we saw in Section 2.9. Thus, in
the physiological pH range of 6.8 to 7.4, amino acids are zwitterions, or dipolar ions, even
though their net charge may be zero. We will see in Section 3.4 that some side chains can
also ionize. Biochemists always represent the structures of amino acids in the form that is
biologically relevant, which is why you will see the zwitterions in the following figures.
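
The claim that both groups are almost fully ionized between pH 6.8 and 7.4 can be checked with the Henderson-Hasselbalch relationship from Section 2.9. The following is a minimal sketch, not part of the original text: the function name is hypothetical, and the pKa values 9.0 and 3.0 are illustrative stand-ins for "close to 9" and "below 3".

```python
def fraction_protonated(pH: float, pKa: float) -> float:
    """Fraction of an ionizable group in its protonated (acid) form.

    Rearranged Henderson-Hasselbalch: [HA] / ([HA] + [A-]) = 1 / (1 + 10**(pH - pKa)).
    """
    return 1.0 / (1.0 + 10.0 ** (pH - pKa))


# Illustrative pKa values (assumptions), chosen to match "close to 9" and "below 3".
PKA_AMINO = 9.0
PKA_CARBOXYL = 3.0

for pH in (6.8, 7.4):
    amino_charged = fraction_protonated(pH, PKA_AMINO)               # -NH3+ form
    carboxyl_charged = 1.0 - fraction_protonated(pH, PKA_CARBOXYL)   # -COO- form
    print(f"pH {pH}: -NH3+ fraction = {amino_charged:.4f}, "
          f"-COO- fraction = {carboxyl_charged:.5f}")
```

With these assumed values more than 97% of the amino groups carry a positive charge and essentially all carboxyl groups carry a negative charge across the physiological range, which is why the zwitterion is the form drawn in the figures.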

Figure 3.1a shows the general three-dimensional structure of an amino acid. Figure
3.1b shows a ball-and-stick model of a representative amino acid, serine, whose side
chain is -CH₂OH. The first carbon atom that’s directly bound to the carboxylate carbon
is the α-carbon, so the other carbon atoms of a side chain are sequentially labeled β,
γ, δ, and ε, referring to carbons 3, 4, 5, and 6, respectively, in the newer convention. The
systematic name for serine is 2-amino-3-hydroxypropanoic acid.

In 19 of the 20 common amino acids the α-carbon atom is chiral, or asymmetric,
since it has four different groups bonded to it. The exception is glycine, whose R group
is simply a hydrogen atom. The molecule is not chiral because the α-carbon atom is
bonded to two identical hydrogen atoms. The 19 chiral amino acids can therefore exist
as stereoisomers. Stereoisomers are compounds that have the same molecular formula 
but differ in the arrangement, or configuration, of their atoms in space. The two 
stereoisomers are distinct molecules that can’t be easily converted from one form to the 
other since a change in configuration requires the breaking of one or more bonds. 
Amino acid stereoisomers are nonsuperimposable mirror images called enantiomers. 
Two of the 19 chiral amino acids, isoleucine and threonine, have two chiral carbon 
atoms each. Isoleucine and threonine can each form four different stereoisomers. 
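
The counts in this paragraph follow from simple combinatorics: each chiral center can take one of two configurations, so a molecule with n independent chiral centers has 2ⁿ stereoisomers. The sketch below is only an illustration; the labels "a" and "b" are arbitrary placeholders for the two configurations at a center, and the function names are hypothetical.

```python
from itertools import product


def configurations(n_chiral_centers: int) -> list[tuple[str, ...]]:
    """Enumerate every combination of configurations over n chiral centers."""
    return list(product("ab", repeat=n_chiral_centers))


def mirror_image(config: tuple[str, ...]) -> tuple[str, ...]:
    """The enantiomer has the opposite configuration at every center."""
    return tuple("b" if c == "a" else "a" for c in config)


for name, n in (("glycine", 0), ("alanine", 1), ("isoleucine", 2), ("threonine", 2)):
    forms = configurations(n)
    print(f"{name}: {len(forms)} form(s)")
    for form in forms:
        if n and form < mirror_image(form):  # print each enantiomer pair once
            print("   enantiomer pair:", form, "<->", mirror_image(form))
```

For two centers the four forms fall into two mirror-image pairs, which matches the isoleucine/alloisoleucine and threonine/allothreonine pairs described later in the chapter.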


By convention, the mirror-image pairs of amino acids are designated D (for dextro,
from the Latin dexter, “right”) and L (for levo, from the Latin laevus, “left”). The configuration
of the amino acid in Figure 3.1a is L and that of its mirror image is D. To assign
the stereochemical designation, one draws the amino acid vertically with its α-carboxylate
group at the top and its side chain at the bottom, both pointing away from the
viewer. In this orientation, the α-amino group of the L isomer is on the left of the α-carbon,
and that of the D isomer is on the right, as shown in Figure 3.2. (The four atoms attached
to the α-carbon occupy the four corners of a tetrahedron much like the bonding
of hydrogen atoms to oxygen in water, as shown in Figure 2.4.)

The 19 chiral amino acids used in the assembly of proteins are all of the L configu- 
ration, although a few D-amino acids occur in nature. By convention, amino acids are 
assumed to be in the L configuration unless specifically designated D. Often it is convenient
to draw the structures of L-amino acids in a form that is stereochemically uncommitted,
especially when a correct stereochemical representation is not critical to a given

The fact that all living organisms use the same standard amino acids in protein 
synthesis is evidence that all species on Earth are descended from a common ancestor. 
Like modern organisms, the last common ancestor (LCA) must have used L-amino 


◄ Figure 3.1 

Two representations of an L-amino acid at neutral
pH. (a) General structure. An amino acid
has a carboxylate group (whose carbon atom
is designated C-1), an amino group, a hydrogen
atom, and a side chain (or R group), all
attached to C-2 (the α-carbon). Solid
wedges indicate bonds above the plane of
the paper; dashed wedges indicate bonds
below the plane of the paper. The blunt
ends of wedges are nearer the viewer than
the pointed ends. (b) Ball-and-stick model
of serine (whose R group is -CH₂OH).

▲ Meteorites and amino acids. The Murchison
meteorite fell in 1969 near Murchison,
Australia. There are many similar carbonaceous
meteorites and many of them contain
spontaneously formed amino acids, including
some of the common amino acids found
in proteins. These amino acids are found in
the meteorites as almost equal mixtures of
the L and D configurations.





See Section 8.1 for a more complete 
description of the convention for 
displaying stereoisomers (Fischer 




◄ Figure 3.2 

Mirror-image pairs of amino acids. (a) Ball-and-stick
models of L-serine and D-serine.
Note that the two molecules are not identical;
they cannot be superimposed. (b) L-Serine
and D-serine. The common amino acids
all have the L configuration.



acids and not D-amino acids. Mixtures of L- and D-amino acids are formed under con- 
ditions that mimic those present when life first arose on Earth 4 billion years ago and 
both enantiomers are found in meteorites and in the vicinity of stars. It is not known 
how or why primitive life forms selected L-amino acids from the presumed mixture of 
the enantiomers present when life first arose. It’s likely that the first proteins were com- 
posed of a small number of simple amino acids and selection of L-amino acids over 
D-amino acids was a chance event. Modern living organisms do not select L-amino acids 
from a mixture because only the L-amino acids are synthesized in sufficient quantities. 
Thus, the predominance of L-amino acids in modern species is due to the evolution of 
metabolic pathways that produce L-amino acids and not D-amino acids (Chapter 17). 

3.2 Structures of the 20 Common Amino Acids 

The structures of the 20 amino acids commonly found in proteins are shown in the fol- 
lowing figures as Fischer projections. In Fischer projections, horizontal bonds at a chiral 
center extend toward the viewer and vertical bonds extend away (as in Figures 3.1 and 3.2). 
Examination of the structures reveals considerable variation in the side chains of the 20 
amino acids. Some side chains are nonpolar and thus hydrophobic whereas others are 
polar or ionized at neutral pH and are therefore hydrophilic. The properties of the side
chains greatly influence the overall three-dimensional shape, or conformation, of a protein.
For example, most of the hydrophobic side chains of a water-soluble protein fold
into the interior giving the protein a compact, globular shape.

Some nonstandard amino acids are described in Section 3.3.

Both the three-letter and one-letter abbreviations for each amino acid are shown in 
the figures. The three-letter abbreviations are self-evident but the one-letter abbreviations 
are less obvious. Several amino acids begin with the same letter so other letters of the 
alphabet have to be used in order to provide a unique label; for example, threonine = T, 
tyrosine = Y, and tryptophan = W. These labels have to be memorized. 
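
Because the one-letter labels have to be memorized, a lookup table is a convenient study aid. The sketch below collects the abbreviations given in this chapter into a dictionary and converts a one-letter sequence to three-letter form. The function name and the example sequence are hypothetical and serve only as an illustration.

```python
# One-letter -> three-letter abbreviations for the 20 common amino acids,
# as listed in Section 3.2.
ONE_TO_THREE = {
    "G": "Gly", "A": "Ala", "V": "Val", "L": "Leu", "I": "Ile",
    "P": "Pro", "F": "Phe", "Y": "Tyr", "W": "Trp", "M": "Met",
    "C": "Cys", "S": "Ser", "T": "Thr", "H": "His", "K": "Lys",
    "R": "Arg", "D": "Asp", "E": "Glu", "N": "Asn", "Q": "Gln",
}


def to_three_letter(sequence: str) -> str:
    """Convert a one-letter sequence such as 'TYW' to 'Thr-Tyr-Trp'."""
    try:
        return "-".join(ONE_TO_THREE[residue] for residue in sequence.upper())
    except KeyError as err:
        raise ValueError(f"not a common amino acid code: {err.args[0]!r}") from None


# Hypothetical example sequence using the three "T-like" residues from the text.
print(to_three_letter("TYW"))
```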


Amino acids can spontaneously convert from the D configu- 
ration to the L configuration and vice versa. This is a chemical 
reaction that usually proceeds through a carbanion interme- 

The racemization reaction is normally very slow but it 
can be sped up at high temperatures. For example, the half- 
life for conversion of L-aspartate to D-aspartate is about 30 
days at 100°C. The half-life of this reaction at 37°C is about 
350 years and at 18°C it is about 50,000 years. 

The amino acid composition of mammalian tooth 
enamel can be used to determine the age of a fossil if the av- 
erage temperature of the environment is known or can be es- 
timated. When the amino acids are first synthesized they are 
exclusively of the L configuration. Over time, the amount of 
the D enantiomer increases and the d/l ratio can be measured 
very precisely. 
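
The dating idea can be illustrated with first-order kinetics. The sketch below is a deliberate simplification and not the method used in practice: it treats conversion of the L form as an irreversible first-order process with the half-lives quoted above, and it ignores the back reaction that eventually brings the D/L ratio to 1. The function names are hypothetical.

```python
import math


def fraction_converted(elapsed_years: float, half_life_years: float) -> float:
    """Fraction of the original L form converted after a given time.

    Simplifying assumption: irreversible first-order decay, N(t) = N0 * 0.5**(t / t_half).
    """
    return 1.0 - 0.5 ** (elapsed_years / half_life_years)


def age_from_fraction(converted: float, half_life_years: float) -> float:
    """Invert the relationship above to estimate elapsed time."""
    if not 0.0 <= converted < 1.0:
        raise ValueError("converted fraction must be in [0, 1)")
    return -half_life_years * math.log2(1.0 - converted)


# Half-lives quoted in the text for aspartate at two temperatures.
HALF_LIFE_37C = 350.0      # years
HALF_LIFE_18C = 50_000.0   # years

print(fraction_converted(70.0, HALF_LIFE_37C))   # one human lifetime at body temperature
print(age_from_fraction(0.10, HALF_LIFE_18C))    # age implied by 10% conversion at 18°C
```

The strong dependence of the half-life on temperature is why the average temperature of the environment has to be known or estimated before an age can be inferred.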

Fossil dating by measuring amino acid racemization has
been superseded by more reliable methods but it’s an interesting
example of a slow chemical reaction. Some organisms
contain specific racemases that catalyze the interconversion
of an L-amino acid and a D-amino acid; for example, bacteria
have alanine racemase for converting L-alanine to D-alanine
(see Section 8.7B). These enzymes catalyze thousands of reactions
per second.


[Figure: interconversion of an L-amino acid and a D-amino acid.]

▲ The Badegoule Jaw from a stone age juvenile Homo sapiens.
(Natural History Museum, Lyon, France)


It is important to learn the structures of the standard amino acids because we refer 
to them frequently in the chapters on protein structure, enzymes, and protein synthesis. 
In the following sections we have grouped the standard amino acids by their general 
properties and the chemical structures of their side chains. The side chains fall into the 
following chemical classes: aliphatic, aromatic, sulfur-containing, alcohols, positively 
charged, negatively charged, and amides. Of the 20 amino acids five are further classi- 
fied as highly hydrophobic (blue) and seven are classified as highly hydrophilic (red). 
Understanding the classification of the R groups will simplify memorizing the struc- 
tures and names. 

A. Aliphatic R Groups 

Glycine (Gly, G) is the smallest amino acid. Since its R group is simply a hydrogen atom, 
the α-carbon of glycine is not chiral. The two hydrogen atoms of the α-carbon of 
glycine impart little hydrophobic character to the molecule. We will see that glycine 
plays a unique role in the structure of many proteins because its side chain is small 
enough to fit into niches that cannot accommodate any other amino acid. 

Four amino acids, alanine (Ala, A), valine (Val, V), leucine (Leu, L), and the structural
isomer of leucine, isoleucine (Ile, I), have saturated aliphatic side chains. The side
chain of alanine is a methyl group whereas valine has a three-carbon branched side
chain and leucine and isoleucine each contain a four-carbon branched side chain. Both
the α- and β-carbon atoms of isoleucine are asymmetric. Because isoleucine has two
chiral centers, it has four possible stereoisomers. The stereoisomer used in proteins
is called L-isoleucine and the amino acid that differs at the β-carbon is called
L-alloisoleucine (Figure 3.3). The other two stereoisomers are D-isoleucine and

Alanine, valine, leucine, and isoleucine play an important role in establishing and 
maintaining the three-dimensional structures of proteins because of their tendency to 
cluster away from water. Valine, leucine, and isoleucine are known collectively as the 
branched chain amino acids because their side chains of carbon atoms contain 
branches. All three amino acids are highly hydrophobic and they share biosynthesis and 
degradation pathways (Chapter 17). 

Proline (Pro, P) differs from the other 19 amino acids because its three-carbon side 
chain is bonded to the nitrogen of its α-amino group as well as to the α-carbon, creating 
a cyclic molecule. As a result, proline contains a secondary rather than a primary amino 
group. The heterocyclic pyrrolidine ring of proline restricts the geometry of polypep- 
tides sometimes introducing abrupt changes in the direction of the peptide chain. The 
cyclic structure of proline makes it much less hydrophobic than valine, leucine, and 

B. Aromatic R Groups 

Phenylalanine (Phe, F), tyrosine (Tyr, Y), and tryptophan (Trp, W) have side chains 
with aromatic groups. Phenylalanine has a hydrophobic benzyl side chain. Tyrosine is 
structurally similar to phenylalanine except that the para hydrogen of phenylalanine is 
replaced in tyrosine by a hydroxyl group (-OH), making tyrosine a phenol. The hydroxyl
group of tyrosine is ionizable but retains its hydrogen under normal physiological
conditions. The side chain of tryptophan contains a bicyclic indole group. Tyrosine and 





Structures: Glycine [G], Alanine [A], Valine [V], Leucine [L], Isoleucine [I], Phenylalanine [F], Tyrosine [Y], Tryptophan [W], Proline [P]

◄ Figure 3.3 

Stereoisomers of isoleucine. Isoleucine and
threonine are the only two common amino
acids with more than one chiral center. The
other DL pair of isoleucine isomers is called
alloisoleucine. Note that in L-isoleucine the
-NH₃⁺ and -CH₃ groups are both on the
left in this projection, while in D-isoleucine
they are both on the right, so that
D-isoleucine and L-isoleucine are
mirror images.



▲ UV absorbance of proteins. The absorbance
of most proteins peaks at 280 nm.
Most of the absorbance is due to the presence
of tryptophan and tyrosine residues in
the protein.

Structures: Methionine [M], Cysteine [C], Serine [S], Threonine [T]

▲ A sulfur bridge. Natural stone bridge, 
Puente del Inca, in Mendoza, Argentina. 
Over the years the bridge has been covered 
with sulfur deposits. 

tryptophan are not as hydrophobic as phenylalanine because their side chains include 
polar groups (Table 3.1, page 62). 

All three aromatic amino acids absorb ultraviolet (UV) light because, unlike 
the saturated aliphatic amino acids, the aromatic amino acids contain delocalized 
π-electrons. At neutral pH both tryptophan and tyrosine absorb light at a wavelength 
of 280 nm whereas phenylalanine is almost transparent at 280 nm and absorbs light 
weakly at 260 nm. Since most proteins contain tryptophan and tyrosine they will absorb 
light at 280 nm. Absorbance at 280 nm is routinely used to estimate the concentration 
of proteins in solutions. 
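
As a rough illustration of how absorbance at 280 nm is turned into a concentration, the sketch below applies the Beer-Lambert law, A = ε c l. None of the numbers come from this chapter: the per-residue molar absorption coefficients (about 5500 M⁻¹ cm⁻¹ for tryptophan and 1490 M⁻¹ cm⁻¹ for tyrosine) are commonly used approximations, the residue counts describe a hypothetical protein, and the small contribution of disulfide bonds is ignored.

```python
# Approximate molar absorption coefficients at 280 nm (assumed values, M^-1 cm^-1).
EPSILON_TRP = 5500.0
EPSILON_TYR = 1490.0


def molar_absorptivity_280(n_trp: int, n_tyr: int) -> float:
    """Estimate a protein's molar absorption coefficient from its Trp and Tyr content."""
    return n_trp * EPSILON_TRP + n_tyr * EPSILON_TYR


def concentration_from_a280(a280: float, n_trp: int, n_tyr: int,
                            path_length_cm: float = 1.0) -> float:
    """Beer-Lambert law rearranged for concentration (mol/L): c = A / (epsilon * l)."""
    epsilon = molar_absorptivity_280(n_trp, n_tyr)
    if epsilon == 0:
        raise ValueError("protein has no Trp or Tyr; A280 cannot be used")
    return a280 / (epsilon * path_length_cm)


# Hypothetical protein with 2 tryptophan and 4 tyrosine residues.
conc = concentration_from_a280(0.848, n_trp=2, n_tyr=4)
print(f"estimated concentration: {conc * 1e6:.1f} micromolar")
```

The error raised for a protein without tryptophan or tyrosine reflects the point made above: phenylalanine is almost transparent at 280 nm, so such a protein cannot be quantified this way.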

C. R Groups Containing Sulfur 

Methionine (Met, M) and cysteine (Cys, C) are the two amino acids whose side chains 
contain a sulfur atom. Methionine contains a nonpolar methyl thioether group in its 
side chain and this makes it one of the more hydrophobic amino acids. Methionine 
plays a special role in protein synthesis because it is almost always the first amino acid in 
a growing polypeptide chain. The structure of cysteine resembles that of alanine with a 
hydrogen atom replaced by a sulfhydryl group (-SH). 

Although the side chain of cysteine is somewhat hydrophobic, it is also highly reactive.
Because the sulfur atom is polarizable, the sulfhydryl group of cysteine can form
weak hydrogen bonds with oxygen and nitrogen. Moreover, the sulfhydryl group of cysteine
residues in proteins can be a weak acid, which allows it to lose its proton to become
a negatively charged thiolate ion. (The pKa of the sulfhydryl group of the free amino
acid is 8.3 but this can range from 5 to 10 in proteins.)

A compound called cystine can be isolated when some proteins are hydrolyzed. 
Cystine is formed from two oxidized cysteine molecules linked by a disulfide bond 
(Figure 3.4). Oxidation of the sulfhydryl groups of cysteine molecules proceeds most 
readily at slightly alkaline pH values because the sulfhydryl groups are ionized at high pH. 
The two cysteine side chains must be adjacent in three-dimensional space in order to form 
a disulfide bond but they don’t have to be close together in the amino acid sequence of the 
polypeptide chain. They may even be found in different polypeptide chains. Disulfide 
bonds, or disulfide bridges, may stabilize the three-dimensional structures of some pro- 
teins by covalently cross-linking cysteine residues in peptide chains. Most proteins do not 
contain disulfide bridges because conditions inside the cell do not favor oxidation; 
however, many secreted, or extracellular, proteins contain disulfide bridges. 

D. Side Chains with Alcohol Groups 

Serine (Ser, S) and threonine (Thr, T) have uncharged polar side chains containing 
β-hydroxyl groups. These alcohol groups give a hydrophilic character to the aliphatic 

⁻OOC-CH(NH₃⁺)-CH₂-SH + HS-CH₂-CH(NH₃⁺)-COO⁻
Cysteine + Cysteine

→

⁻OOC-CH(NH₃⁺)-CH₂-S-S-CH₂-CH(NH₃⁺)-COO⁻ + 2 H⁺

▲ Figure 3.4 

Formation of cystine. When oxidation links the sulfhydryl groups of two cysteine molecules, the re- 
sulting compound is a disulfide called cystine. 



The RS system of configurational nomenclature is also some- 
times used to describe the chiral centers of amino acids. The 
RS system is based on the assignment of a priority sequence 
to the four groups bound to a chiral carbon atom. Once as- 
signed, the group priorities are used to establish the configu- 
ration of the molecule. Priorities are numbered 1 through 
4 and are assigned to groups according to the following rules: 

1. For atoms directly attached to the chiral carbon, the one 
with the lowest atomic mass is assigned the lowest prior- 
ity (number 4). 

2. If there are two identical atoms bound to the chiral car- 
bon, the priority is decided by the atomic mass of the 
next atoms bound. For example, a -CH₃ group has a
lower priority than a -CH₂Br group because hydrogen
has a lower atomic mass than bromine.

3. If an atom is bound by a double or triple bond, the atom
is counted once for each formal bond. Thus, -CHO,
with a double-bonded oxygen, has a higher priority than
-CH₂OH. The order of priority for the most common
groups, from lowest to highest, is -H, -CH₃,
-C₆H₅, -CH₂OH, -CHO, -COOH, -COOR,
-NH₂, -NHR, -OH, -OR, and -SH.

With these rules in mind, imagine the molecule as the 
steering wheel of a car, with the group of lowest priority 
(numbered 4) pointing away from you (like the steering col- 
umn) and the other three groups arrayed around the rim of 
the steering wheel. Trace the rim of the wheel, moving from 
the group of highest priority to the group of lowest priority 
(1, 2, 3). If the movement is clockwise, the configuration is R 
(from the Latin rectus , “right-handed”). If the movement is 
counterclockwise, the configuration is S (from the Latin, 
sinister , “left-handed”). The figure demonstrates the assign- 
ment of S configuration to L-serine by the RS system. 
L-Cysteine has the opposite configuration, R. The DL system 
is used more often in biochemistry because not all amino 
acids found in proteins have the same RS designation. 
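
The three priority rules can be sketched in a few lines to show why L-serine and L-cysteine end up with different RS designations even though their atoms are arranged the same way in space. This is a simplified illustration, not a full implementation of the RS system: it looks only at the atom attached to the chiral carbon and at that atom's own substituents, it uses rounded atomic masses, and the group labels and function names are hypothetical.

```python
# Rounded atomic masses; only their relative order matters for ranking.
ATOMIC_MASS = {"H": 1, "C": 12, "N": 14, "O": 16, "S": 32}


def priority_key(attached_atom: str, substituents: str) -> tuple:
    """Rule 1: mass of the attached atom. Rule 2: masses of the next atoms, heaviest first."""
    next_masses = sorted((ATOMIC_MASS[a] for a in substituents), reverse=True)
    return (ATOMIC_MASS[attached_atom], tuple(next_masses))


def rank_groups(groups: dict) -> list:
    """Return group labels ordered from priority 1 (highest) to priority 4 (lowest)."""
    return sorted(groups, key=lambda label: priority_key(*groups[label]), reverse=True)


# Each group: (atom bonded to the alpha-carbon, atoms bonded to that atom).
serine = {
    "NH3+": ("N", "HHH"),
    "COO-": ("C", "OOO"),    # rule 3: the double-bonded oxygen is counted twice
    "CH2OH": ("C", "OHH"),
    "H": ("H", ""),
}
cysteine = {
    "NH3+": ("N", "HHH"),
    "COO-": ("C", "OOO"),
    "CH2SH": ("C", "SHH"),   # sulfur outranks oxygen, lifting the side chain above COO-
    "H": ("H", ""),
}

print("serine:  ", rank_groups(serine))
print("cysteine:", rank_groups(cysteine))
```

In serine the carboxylate group has priority 2 and the side chain priority 3; in cysteine the two are exchanged. Because the groups occupy the same positions in both L-amino acids, exchanging priorities 2 and 3 reverses the direction traced from 1 to 3, which gives S for L-serine and R for L-cysteine.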

◄ Assignment of configuration by the RS
system. (a) Each group attached to a chiral
carbon is assigned a priority based on atomic
mass, 4 being the lowest priority. (b) By orienting
the molecule with the priority 4 group
pointing away (behind the chiral carbon) and
tracing the path from the highest priority group
to the lowest, the absolute configuration can
be established. If the sequence 1, 2, 3 is
clockwise, the configuration is R. If the sequence
1, 2, 3 is counterclockwise, the configuration
is S. L-Serine has the S configuration.

side chains. Unlike the more acidic phenolic side chain of tyrosine, the hydroxyl groups
of serine and threonine have the weak ionization properties of primary and secondary
alcohols. The hydroxymethyl group of serine (-CH₂OH) does not appreciably ionize
in aqueous solutions; nevertheless, this alcohol can react within the active sites of a
number of enzymes as though it were ionized. Threonine, like isoleucine, has two chiral
centers: the α- and β-carbon atoms. L-Threonine is the only one of the four stereoisomers
that commonly occurs in proteins. (The other stereoisomers are called D-threonine,
L-allothreonine, and D-allothreonine.)

E. Positively Charged R Groups 

Histidine (His, H), lysine (Lys, K), and arginine (Arg, R) have hydrophilic side chains 
that are nitrogenous bases. The side chains can be positively charged at physiological 

The side chain of histidine contains an imidazole ring substituent. The proto- 
nated form of this ring is called an imidazolium ion (Section 3.4). At pH 7 most his- 
tidines are neutral (base form) as shown in the accompanying figure but the form 
with a positively charged side chain is present and it becomes more common at 
slightly lower pH. 

Lysine is a diamino acid with both α- and ε-amino groups. The ε-amino group
exists as an alkylammonium ion (-CH₂-NH₃⁺) at neutral pH and confers a positive
charge on proteins. Arginine is the most basic of the 20 amino acids because its


Structures: Histidine [H], Lysine [K], Arginine [R]




Structures: Aspartate [D], Glutamate [E], Asparagine [N], Glutamine [Q]



Table 3.1 Hydropathy scale

Amino acid    Free-energy change of transfer (a) (kJ mol⁻¹)

Highly hydrophobic

Less hydrophobic

Highly hydrophilic

(a) The free-energy change is for transfer of an amino acid residue from the interior of a lipid bilayer to water.

(b) On other scales, tryptophan has a lower hydropathy value.

[Adapted from Eisenberg, D., Weiss, R. M., Terwilliger, T. C., Wilcox, W. (1982). Hydrophobic
moments in protein structure. Faraday Symp. Chem. Soc. 17:109-120.]

side-chain guanidinium ion is protonated under all conditions normally found within a 
cell. Arginine side chains also contribute positive charges in proteins. 

F. Negatively Charged R Groups and Their Amide Derivatives 

Aspartate (Asp, D) and glutamate (Glu, E) are dicarboxylic amino acids and have negatively
charged hydrophilic side chains at pH 7. In addition to α-carboxyl groups, aspartate
possesses a β-carboxyl group and glutamate possesses a γ-carboxyl group. Aspartate
and glutamate confer negative charges on proteins because their side chains are
ionized at pH 7. Aspartate and glutamate are sometimes called aspartic acid and glutamic
acid but under most physiological conditions they are found as the conjugate
bases and, like other carboxylates, have the suffix -ate. Glutamate is probably familiar as
its monosodium salt, monosodium glutamate (MSG), which is used in food as a flavor

Asparagine (Asn, N) and glutamine (Gln, Q) are the amides of aspartic acid and 
glutamic acid, respectively. Although the side chains of asparagine and glutamine are 
uncharged these amino acids are highly polar and are often found on the surfaces of 
proteins where they can interact with water molecules. The polar amide groups of as- 
paragine and glutamine can also form hydrogen bonds with atoms in the side chains of 
other polar amino acids. 

G. The Hydrophobicity of Amino Acid Side Chains 

The various side chains of amino acids range from highly hydrophobic, through weakly 
polar, to highly hydrophilic. The relative hydrophobicity or hydrophilicity of each 
amino acid is called its hydropathy. 

There are several ways of measuring hydropathy, but most of them rely on calculat- 
ing the tendency of an amino acid to prefer a hydrophobic environment over a hy- 
drophilic environment. A commonly used hydropathy scale is shown in Table 3.1. 
Amino acids with highly positive hydropathy values are considered hydrophobic 
whereas those with the largest negative values are hydrophilic. It is difficult to determine 
the hydropathy values of some amino acid residues that lie near the center of the scale. 
For example, there is disagreement over the hydropathy of the indole group of trypto- 
phan and in some tables tryptophan has a much lower hydropathy value. Conversely, 
cysteine can have a higher hydropathy value in some tables. 

Hydropathy is an important determinant of protein folding because hydrophobic 
side chains tend to be clustered in the interior of a protein and hydrophilic residues 
are usually found on the surface (Section 4.10). However, it is not yet possible to pre- 
dict accurately whether a given residue will be found in the nonaqueous interior of a 
protein or on the solvent-exposed surface. On the other hand, hydropathy measure- 
ments of free amino acids can be successfully used to predict which segments of 
membrane-spanning proteins are likely to be embedded in a hydrophobic lipid 
bilayer (Chapter 9). 
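
Predicting membrane-spanning segments is usually done by averaging hydropathy values over a sliding window along the sequence. The sketch below is illustrative only. It uses the widely cited Kyte-Doolittle scale rather than the values of Table 3.1, so the numbers are on a different scale from the free energies in that table. The example sequence is hypothetical, and the window of 19 residues is a common choice for membrane-spanning helices, not a value taken from this chapter.

```python
# Kyte-Doolittle hydropathy index (a different scale from Table 3.1; positive = hydrophobic).
KYTE_DOOLITTLE = {
    "I": 4.5, "V": 4.2, "L": 3.8, "F": 2.8, "C": 2.5, "M": 1.9, "A": 1.8,
    "G": -0.4, "T": -0.7, "S": -0.8, "W": -0.9, "Y": -1.3, "P": -1.6,
    "H": -3.2, "E": -3.5, "Q": -3.5, "D": -3.5, "N": -3.5, "K": -3.9, "R": -4.5,
}


def sliding_hydropathy(sequence: str, window: int = 19) -> list[float]:
    """Mean hydropathy of every window of consecutive residues along the sequence."""
    if not 1 <= window <= len(sequence):
        raise ValueError("window must be between 1 and the sequence length")
    scores = [KYTE_DOOLITTLE[residue] for residue in sequence.upper()]
    return [sum(scores[i:i + window]) / window
            for i in range(len(scores) - window + 1)]


# Hypothetical sequence: charged ends flanking a 19-residue hydrophobic stretch.
sequence = "MKKDESR" + "LLIVALLVAIFLGLIVAVL" + "RRKDEQ"
profile = sliding_hydropathy(sequence)
best_start = max(range(len(profile)), key=profile.__getitem__)
print(f"most hydrophobic window starts at residue {best_start + 1}, "
      f"mean hydropathy {profile[best_start]:.2f}")
```

A window that scores strongly positive marks a candidate for a segment buried in the lipid bilayer, while windows dominated by charged residues score strongly negative.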

3.3 Other Amino Acids and Amino Acid Derivatives 

More than 200 different amino acids are found in living organisms. In addition to 
the 20 common amino acids covered in the previous section there are three others 
that are incorporated into proteins during protein synthesis. The 21st amino acid is 
N-formylmethionine which serves as the initial amino acid during protein synthesis in 
bacteria (Section 22.5). The 22nd amino acid is selenocysteine which contains selenium 
in place of the sulfur of cysteine. It is incorporated into a few proteins in almost every 
species. Selenocysteine is formed from serine during protein synthesis. The 23rd amino 
acid is pyrrolysine, found in some species of archaebacteria. Pyrrolysine is a modified 
form of lysine that is synthesized before being added to a growing polypeptide chain by 
the translation machinery. 

N-formylmethionine, selenocysteine, and pyrrolysine are incorporated at specific 
codons and that’s why they are considered additions to the standard repertoire of pro- 
tein precursors. Because of post-translational modifications many complete proteins 
have more than the standard 23 amino acids used in protein synthesis (see below). 


▲ Figure 3.5 

Compounds derived from common amino acids. (a) γ-Aminobutyrate, a derivative of glutamate.
(b) Histamine, a derivative of histidine. (c) Epinephrine, a derivative of tyrosine. (d) Thyroxine
and triiodothyronine, derivatives of tyrosine. Thyroxine contains one more atom of iodine (in
parentheses) than does triiodothyronine.

In addition to the common 23 amino acids that are incorporated into proteins, all 
species contain a variety of L-amino acids that are either precursors of the common 
amino acids or intermediates in other biochemical pathways. Examples are homocys- 
teine, homoserine, ornithine, and citrulline (see Chapter 17). S-Adenosylmethionine 
(SAM) is a common methyl donor in many biochemical pathways (Section 7.2). Many 
species of bacteria and fungi synthesize D-amino acids that are used in cell walls and in 
complex peptide antibiotics such as actinomycin. 

Several common amino acids are chemically modified to produce biologically important
amines. These are synthesized by enzyme-catalyzed reactions that include decarboxylation
and deamination. In the mammalian brain, for example, glutamate is
converted to the neurotransmitter γ-aminobutyrate (GABA) (Figure 3.5a). Mammals
can also synthesize histamine (Figure 3.5b) from histidine. Histamine controls the con- 
striction of certain blood vessels and also the secretion of hydrochloric acid by the 
stomach. In the adrenal medulla, tyrosine is metabolized to epinephrine, also known as 
adrenaline (Figure 3.5c). Epinephrine and its precursor, norepinephrine (a compound 
whose amino group lacks a methyl substituent), are hormones that help regulate me- 
tabolism in mammals. Tyrosine is also the precursor of the thyroid hormones thyroxine 
and triiodothyronine (Figure 3.5d). Biosynthesis of the thyroid hormones requires io- 
dide. Small amounts of sodium iodide are commonly added to table salt to prevent goi- 
ter, a condition of hypothyroidism caused by a lack of iodide in the diet. 

Some amino acids are chemically modified after they have been incorporated into 
polypeptides. In fact, there are hundreds of known post-translational modifications. 
For example, some proline residues in the protein collagen are oxidized to form hydroxyproline
residues (Section 4.11). Another common modification is the addition of complex
carbohydrate chains, a process known as glycosylation (Chapters 8 and 22). Many
proteins are phosphorylated, usually by the addition of phosphoryl groups to the side 
chains of serine, threonine, or tyrosine (histidine, lysine, cysteine, aspartate, and gluta- 
mate can also be phosphorylated). The oxidation of pairs of cysteine residues to form 
cystine also occurs after a polypeptide has been synthesized. 

3.4 Ionization of Amino Acids 

The physical properties of amino acids are influenced by the ionic states of the α-carboxyl
and α-amino groups and of any ionizable groups in the side chains. Each ionizable
group is associated with a specific pKa value that corresponds to the pH at which the






Alanine: probably from aldehyde + “an” (for convenience) + amine (1849)
Arginine: crystallizes as a silver salt, from Latin argentum (silver) (1886)
Asparagine: first isolated from asparagus (1813)
Aspartate: similar to asparagine (1836)
Cysteine: from the Greek kystis (bladder), discovered in bladder stones (1882)
Glutamate: first identified in the plant protein gluten
Glutamine: similar to glutamate (1866)
Glycine: from the Greek glykys (sweet), tastes sweet
Histidine: first isolated from sturgeon sperm, named for the Greek histidin (tissue) (1896)
Isoleucine: isomer of leucine
Leucine: from the Greek leukos (white), forms white crystals (1820)
Lysine: product of protein hydrolysis, from the Greek lysis (loosening) (1891)
Methionine: side chain is a sulfur (Greek theion) atom with a methyl group (1928)
Phenylalanine: alanine with a phenyl group (1883)
Proline: a corrupted form of “pyrrolidine” because it forms a pyrrolidine ring (1904)
Serine: from the Latin sericum (silk), serine is common in silk (1865)
Threonine: similar to the four-carbon sugar threose
Tryptophan: isolated from a tryptic digest of protein + Greek phanein (to appear) (1890)
Tyrosine: found in cheese, from the Greek tyros (cheese) (1890)
Valine: derivative of valeric acid from the plant genus Valeriana (1906)

Sources: Oxford English Dictionary 2nd ed., and Leung, S.H. (2000) Amino
acids, aromatic compounds, and carboxylic acids: how did they get their
common names? J. Chem. Educ. 77: 48-49.


For every acid-base pair the pKa is the pH
at which the concentrations of the two
forms are equal.

concentrations of the protonated and unprotonated forms are equal (Section 2.9).
When the pH of the solution is below the pKa the protonated form predominates and
the amino acid is then a true acid that is capable of donating a proton. When the pH of
the solution is above the pKa of the ionizable group the unprotonated form of that
group predominates and the amino acid exists as the conjugate base, which is a proton
acceptor. Every amino acid has at least two pKa values corresponding to the ionization
of the α-carboxyl and α-amino groups. In addition, seven of the common amino acids
have ionizable side chains with additional, measurable pKa values. These values differ
among the amino acids. Thus, at a given pH, amino acids frequently have different net
charges. Many of the modified amino acids have additional ionizable groups, contributing
to the diversity of charged amino acid side chains in proteins. Phosphoserine and
phosphotyrosine, for example, will be negatively charged.

Knowing the ionic states of amino acid side chains is important for two reasons. 
First, the charged state influences protein folding and the three-dimensional structure of 
proteins (Section 4.10). Second, an understanding of the ionic properties of amino acids 
in the active site of an enzyme helps one understand enzyme mechanisms (Chapter 6). 

The pKa values of amino acids are determined from titration curves such as those
we saw in the previous chapter. The titration of alanine is shown in Figure 3.6. Alanine
has two ionizable groups: the α-carboxyl and the protonated α-amino group. As more
base is added to the solution of acid, the titration curve exhibits two pKa values, at pH
2.4 and pH 9.9. Each pKa value is associated with a buffering zone where the pH of the
solution changes relatively little when more base is added.

The pK a of an ionizable group corresponds to a midpoint of its titration curve. It is 
the pH at which the concentration of the acid form (proton donor) exactly equals the 
concentration of its conjugate base (proton acceptor). In the example shown in Figure 3.6 
the concentrations of the positively charged form of alanine and of the zwitterion are 
equal at pH 2.4. 

$$\mathrm{{}^{+}H_3N{-}CH(CH_3){-}COOH} \;\rightleftharpoons\; \mathrm{{}^{+}H_3N{-}CH(CH_3){-}COO^{-}} + \mathrm{H^{+}} \qquad (3.1)$$



Figure 3.6

Titration curve for alanine. The first pKa value
is 2.4; the second is 9.9. pI_Ala represents
the isoelectric point of alanine.

At pH 9.9 the concentration of the zwitterion equals the concentration of the negatively
charged form.

$$\mathrm{{}^{+}H_3N{-}CH(CH_3){-}COO^{-}} \;\rightleftharpoons\; \mathrm{H_2N{-}CH(CH_3){-}COO^{-}} + \mathrm{H^{+}} \qquad (3.2)$$



The ionic state of a particular amino acid 
side chain is determined by its pKa value
and the pH of the local environment. 

Note that in the acid-base pair shown in the first equilibrium (Reaction 3.1) the 
zwitterion is the conjugate base of the acid form of alanine. In the second acid-base pair 
(Reaction 3.2) the zwitterion is the proton donor, or conjugate acid, of the more basic 
form that predominates at higher pH. 

One can deduce that the net charge on alanine molecules at pH 2.4 averages +0.5
because there are equal amounts of neutral zwitterion (+/-) and cation (+). The net
charge at pH 9.9 averages -0.5. Midway between pH 2.4 and pH 9.9, at pH 6.15, the
average net charge on alanine molecules in solution is zero. For this reason, pH 6.15 is
referred to as the isoelectric point (pI), or isoelectric pH, of alanine. If alanine were placed
in an electric field at a pH below its pI it would carry a net positive charge (in other
words, its cationic form would predominate), and it would therefore migrate toward the
cathode (the negative electrode). At a pH higher than its pI alanine would carry a net
negative charge and would migrate toward the anode (the positive electrode). At its
isoelectric point (pH = 6.15) alanine would not migrate in either direction.
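
The averages quoted above can be checked numerically. The following is a minimal sketch, assuming the two groups of alanine ionize independently and using the pKa values from the text (2.4 and 9.9). The function names and the 0.001 charge threshold used to call "no net migration" are illustrative choices, not part of the text.

```python
def fraction_deprotonated(pH, pKa):
    """Fraction of an ionizable group in its conjugate-base form (Henderson-Hasselbalch)."""
    ratio = 10 ** (pH - pKa)  # [proton acceptor] / [proton donor]
    return ratio / (1 + ratio)


def alanine_net_charge(pH, pKa_carboxyl=2.4, pKa_amino=9.9):
    """Average net charge on alanine molecules at a given pH."""
    # alpha-carboxyl group: 0 when protonated, -1 when deprotonated
    carboxyl_charge = -fraction_deprotonated(pH, pKa_carboxyl)
    # alpha-amino group: +1 when protonated, 0 when deprotonated
    amino_charge = 1 - fraction_deprotonated(pH, pKa_amino)
    return carboxyl_charge + amino_charge


def migration_direction(pH, threshold=1e-3):
    """Electrode toward which alanine would move (threshold is an arbitrary cutoff)."""
    charge = alanine_net_charge(pH)
    if charge > threshold:
        return "cathode (negative electrode)"
    if charge < -threshold:
        return "anode (positive electrode)"
    return "no net migration"


for pH in (2.4, 6.15, 9.9):
    print(f"pH {pH:5.2f}: net charge {alanine_net_charge(pH):+.3f}, {migration_direction(pH)}")
```

The net charge comes out close to +0.5 at pH 2.4, close to zero at pH 6.15, and close to -0.5 at pH 9.9, in line with the reasoning in the paragraph above.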

Histidine contains an ionizable side chain. The titration curve for histidine contains
an additional inflection point that corresponds to the pKa of its side chain (Figure 3.7a).

Figure 3.7

Ionization of histidine. (a) Titration curve for histidine. The three pKa values are 1.8, 6.0,
and 9.3. pI_His represents the isoelectric point of histidine. (b) Deprotonation of the
imidazolium ring of the side chain of histidine. The imidazolium ion (protonated form) of
the histidine side chain loses a proton to give imidazole (deprotonated form).


Table 3.2 pKa values of acidic and basic constituents of free amino acids at 25°C

[Table body not recovered. Column headings: Amino acid, pKa value. Surviving row labels:
Aspartic acid, Glutamic acid.]
















As is the case with alanine, the first pKa (1.8) represents the ionization of the α-COOH
group and the most basic pKa value (9.3) represents the ionization of the α-amino
group. The middle pKa (6.0) corresponds to the deprotonation of the imidazolium
ion of the side chain of histidine (Figure 3.7b). At pH 7.0 the ratio of imidazole
(conjugate base) to imidazolium ion (conjugate acid) is 10:1. Thus, the protonated and
neutral forms of the side chain of histidine are both present in significant concentrations
near physiological pH. A given histidine side chain in a protein may be either protonated
or unprotonated depending on its immediate environment within the protein.
In other words, the actual pKa value of the side-chain group may not be the same as its
value for the free amino acid in solution. This property makes the side chain of histidine
ideal for the transfer of protons within the catalytic sites of enzymes. (A famous example
is described in Section 6.7c.)

The isoelectric point of an amino acid that contains only two ionizable groups (the
α-amino and the α-carboxyl groups) is the arithmetic mean of its two pKa values (i.e.,
pI = (pK1 + pK2)/2). However, for an amino acid that contains three ionizable groups,
such as histidine, one must assess the net charge of each ionic species. The isoelectric
point for histidine lies between the pKa values on either side of the species with no net
charge, that is, midway between 6.0 and 9.3, or 7.65.
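
As a cross-check on the midpoint rule, the isoelectric point can also be located numerically as the pH at which the average net charge is zero. The sketch below assumes each group ionizes independently and follows the Henderson-Hasselbalch equation. The pKa values are those quoted in the text, while the data layout and the bisection search are illustrative choices.

```python
def net_charge(pH, groups):
    """Average net charge; groups is a list of (pKa, charge_of_protonated_form)."""
    total = 0.0
    for pKa, protonated_charge in groups:
        ratio = 10 ** (pH - pKa)  # [conjugate base] / [conjugate acid]
        fraction_deprotonated = ratio / (1 + ratio)
        # Losing a proton lowers the charge of the group by one unit.
        total += protonated_charge - fraction_deprotonated
    return total


def isoelectric_point(groups, low=0.0, high=14.0, tolerance=1e-6):
    """Find the pH of zero average net charge by bisection."""
    while high - low > tolerance:
        middle = (low + high) / 2
        if net_charge(middle, groups) > 0:
            low = middle   # still net positive, so the pI lies at higher pH
        else:
            high = middle
    return (low + high) / 2


# (pKa, charge of the protonated form), using the values quoted in the text
alanine = [(2.4, 0), (9.9, +1)]
histidine = [(1.8, 0), (6.0, +1), (9.3, +1)]

print(f"alanine   pI = {isoelectric_point(alanine):.2f}")
print(f"histidine pI = {isoelectric_point(histidine):.2f}")
```

For histidine the search lands essentially on the midpoint of 6.0 and 9.3 because the α-carboxyl group (pKa 1.8) is almost fully deprotonated in that pH range and contributes a nearly constant charge of -1.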

As shown in Table 3.2 the pKa values of the α-carboxyl groups of free amino acids
range from 1.8 to 2.5. These values are lower than those of typical carboxylic acids such
as acetic acid (pKa = 4.8) because the neighboring -NH₃⁺ group withdraws electrons
from the carboxylic acid group and this favors the loss of a proton from the α-carboxyl
group. The side chains, or R groups, also influence the pKa value of the α-carboxyl
group, which is why different amino acids have different pKa values. (We have just seen
that the values for histidine and alanine are not the same.)

The α-COOH group of an amino acid is a weak acid. We can use the
Henderson-Hasselbalch equation (Section 2.9) to calculate the fraction of the group
that is ionized at any given pH.

$$\mathrm{pH} = \mathrm{p}K_\mathrm{a} + \log\frac{[\text{proton acceptor}]}{[\text{proton donor}]}$$


For a typical amino acid whose α-COOH group has a pKa of 2.0, the ratio of proton
acceptor (carboxylate anion) to proton donor (carboxylic acid) at pH 7.0 can be
calculated using the Henderson-Hasselbalch equation.

$$7.0 = 2.0 + \log\frac{[\mathrm{RCOO^{-}}]}{[\mathrm{RCOOH}]}$$


In this case, the ratio of carboxylate anion to carboxylic acid is 100,000:1. This 
means that under the conditions normally found inside a cell the carboxylate anion is 
the predominant species. 

The α-amino group of a free amino acid can exist as a free amine, -NH₂ (proton
acceptor), or as a protonated amine, -NH₃⁺ (proton donor). The pKa values range from
8.7 to 10.7 as shown in Table 3.2. For an amino acid whose α-amino group has a pKa value
of 10.0 the ratio of proton acceptor to proton donor is 1:1000 at pH 7.0. In other words,
under physiological conditions the α-amino group is mostly protonated and positively
charged. These calculations verify our earlier statement that free amino acids exist
predominantly as zwitterions at neutral pH. They also show that it is inappropriate to draw
the structure of an amino acid with both -COOH and -NH₂ groups since there is no
pH at which a significant number of molecules contain a protonated carboxyl group and
an unprotonated amino group (see Problem 19). Note that the secondary amino group of
proline (pKa = 10.6) is also protonated at neutral pH, so proline, despite the bonding of
the side chain to the α-amino group, is also zwitterionic at pH 7.
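
The ratios quoted in this section follow directly from rearranging the Henderson-Hasselbalch equation to ratio = 10^(pH - pKa). The sketch below reproduces them and also estimates how rare the uncharged form (with both -COOH and -NH₂) would be. It assumes the two groups ionize independently with the representative pKa values 2.0 and 10.0 used in the text, and the function names are illustrative.

```python
def acceptor_to_donor_ratio(pH, pKa):
    """[proton acceptor]/[proton donor], rearranged from pH = pKa + log(ratio)."""
    return 10 ** (pH - pKa)


def fraction_as_acceptor(pH, pKa):
    """Fraction of the group present as the proton acceptor (conjugate base)."""
    ratio = acceptor_to_donor_ratio(pH, pKa)
    return ratio / (1 + ratio)


def fraction_uncharged_form(pH, pKa_carboxyl=2.0, pKa_amino=10.0):
    """Fraction of molecules carrying both -COOH and -NH2 (groups treated as independent)."""
    protonated_carboxyl = 1 - fraction_as_acceptor(pH, pKa_carboxyl)
    unprotonated_amino = fraction_as_acceptor(pH, pKa_amino)
    return protonated_carboxyl * unprotonated_amino


print(acceptor_to_donor_ratio(7.0, 2.0))    # alpha-carboxyl group at pH 7.0
print(acceptor_to_donor_ratio(7.0, 10.0))   # alpha-amino group at pH 7.0
print(acceptor_to_donor_ratio(7.0, 6.0))    # histidine side chain at pH 7.0

# Largest uncharged fraction found when scanning pH 0 to 14 in steps of 0.1
print(max(fraction_uncharged_form(step / 10) for step in range(141)))
```

Under these assumptions the uncharged form never exceeds roughly one molecule in a hundred million at any pH, which is the quantitative basis for not drawing amino acids with both -COOH and -NH₂ groups.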

The seven standard amino acids with readily ionizable groups in their side chains
are aspartate, glutamate, histidine, cysteine, tyrosine, lysine, and arginine. Ionization of
these groups obeys the same principles as ionization of the α-carboxyl and α-amino
groups and the Henderson-Hasselbalch equation can be applied to each ionization. The
ionization of the γ-carboxyl group of glutamate (pKa = 4.1) is shown in Figure 3.8a.





Figure 3.8

Ionization of amino acid side chains. (a) Ionization of the protonated γ-carboxyl group of
glutamate (pKa = 4.1), converting the carboxylic acid (protonated form) of the side chain to
the carboxylate ion (deprotonated form). The negative charge of the carboxylate anion is
delocalized. (b) Deprotonation of the guanidinium group of the side chain of arginine
(pKa = 12.5), converting the guanidinium ion (protonated form) to the guanidine group
(deprotonated form). The positive charge is delocalized.

Note that the γ-carboxyl group is further removed from the influence of the α-ammonium
ion and behaves as a weak acid with a pKa of 4.1. This makes it similar in strength
to acetic acid (pKa = 4.8) whereas the α-carboxyl group is a stronger acid (pKa = 2.1).
Figure 3.8b shows the deprotonation of the guanidinium group of the side chain of arginine
in a strongly basic solution. Charge delocalization stabilizes the guanidinium ion,
contributing to its high pKa value of 12.5.

As mentioned earlier, the pKa values of ionizable side chains in proteins can differ
from those of the free amino acids. Two factors cause this perturbation of ionization
constants. First, α-amino and α-carboxyl groups lose their charges once they are linked
by peptide bonds in proteins; consequently, they exert weaker inductive effects on
their neighboring side chains. Second, the position of an ionizable side chain within the
three-dimensional structure of a protein can affect its pKa. For example, the enzyme
ribonuclease A has four histidine residues but the side chain of each residue has
a slightly different pKa as a result of differences in their immediate surroundings.

3.5 Peptide Bonds Link Amino Acids in Proteins 

The linear sequence of amino acids in a polypeptide chain is called the primary structure 
of a protein. Higher levels of structure are referred to as secondary, tertiary, and quater- 
nary. The structure of proteins is covered more thoroughly in the next chapter but it’s 
important to understand peptide bonds and primary structure before discussing some 
of the remaining topics in this chapter. 

The linkage formed between amino acids is an amide bond called a peptide bond
(Figure 3.9). This linkage can be thought of as the product of a simple condensation
reaction between the α-carboxyl group of one amino acid and the α-amino group of
another. A water molecule is lost from the condensing amino acids in the reaction. (Recall
from Section 2.6 that such simple condensation reactions are extremely unfavorable in
aqueous solutions due to the huge excess of water molecules. The actual pathway of
protein synthesis involves reactive intermediates that overcome this limitation.) Unlike
the carboxyl and amino groups of free amino acids in solution, the groups involved in
peptide bonds carry no ionic charges.

Linked amino acids in a polypeptide chain are called amino acid residues. The 
names of residues are formed by replacing the ending -ine or -ate with -yl. For example, 
a glycine residue in a polypeptide is called glycyl and a glutamate residue is called glutamyl. 

The structure of peptide bonds is 
described in Section 4.3. 

Protein synthesis (translation) is 
described in Chapter 22. 


Figure 3.9

Peptide bond between two amino acids. The structure of the peptide linkage can be
viewed as the product of a condensation reaction in which the α-carboxyl group of
one amino acid condenses with the α-amino group of another amino acid. The result is a
dipeptide in which the amino acids are linked by a peptide bond. Here, alanine is
condensed with serine to form alanylserine.

$$\mathrm{{}^{+}H_3N{-}CH(CH_3){-}COO^{-}} + \mathrm{{}^{+}H_3N{-}CH(CH_2OH){-}COO^{-}} \;\longrightarrow\; \mathrm{{}^{+}H_3N{-}CH(CH_3){-}CO{-}NH{-}CH(CH_2OH){-}COO^{-}} + \mathrm{H_2O}$$

In the product, the N-terminus is on the left, the C-terminus is on the right, and the
peptide bond is the CO-NH linkage between the two residues.


Figure 3.10

Aspartame (aspartylphenylalanine methyl ester).

In the cases of asparagine, glutamine, and cysteine, -yl replaces the final -e to form
asparaginyl, glutaminyl, and cysteinyl, respectively. The -yl ending indicates that the
residue is an acyl unit (a structure that lacks the hydroxyl of the carboxyl group). The
dipeptide in Figure 3.9 is called alanylserine because alanine is converted to an acyl unit
but the amino acid serine retains its carboxyl group.

The free amino group and free carboxyl group at the opposite ends of a peptide
chain are called the N-terminus (amino terminus) and the C-terminus (carboxyl
terminus), respectively. At neutral pH each terminus carries an ionic charge. By convention,
amino acid residues in a peptide chain are numbered from the N-terminus to the
C-terminus and are usually written from left to right. This convention corresponds to
the direction of protein synthesis (Section 22.6). Synthesis begins with the N-terminal
amino acid, almost always methionine (Section 22.5), and proceeds sequentially toward
the C-terminus by adding one residue at a time.

Both the standard three-letter abbreviations for the amino acids (e.g.,
Gly-Arg-Phe-Ala-Lys) and the one-letter abbreviations (e.g., GRFAK) are used to
describe the sequence of amino acid residues in peptides and polypeptides. It’s important
to know both abbreviation systems. The terms dipeptide, tripeptide, oligopeptide, and
polypeptide refer to chains of two, three, several (up to about 20), and many (usually
more than 20) amino acid residues, respectively. A dipeptide contains one peptide
bond, a tripeptide contains two peptide bonds, and so on. As a general rule, each
peptide chain, whatever its length, possesses one free α-amino group and one free
α-carboxyl group. (Exceptions include covalently modified terminal residues and circular
peptide chains.) Note that the formation of a peptide bond eliminates the ionizable
α-carboxyl and α-amino groups found in free amino acids. As a result, most of the
ionic charges associated with a protein molecule are contributed by the side chains of
the amino acids. This means that the solubility and ionic properties of a protein are
largely determined by its amino acid composition. Furthermore, the side chains
of the residues interact with each other and these interactions contribute to the
three-dimensional shape and stability of a protein molecule (Chapter 4).
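
The two abbreviation systems and the counting rules above are easy to express in code. The following sketch converts a hyphen-separated three-letter sequence (written N-terminus first) to one-letter code, counts peptide bonds in a linear chain, and names the chain by length. The function names are illustrative, and the cutoff of 20 residues mirrors the approximate boundary given in the text rather than a strict definition.

```python
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def to_one_letter(three_letter_sequence):
    """Convert e.g. 'Gly-Arg-Phe' (N-terminus first) to one-letter code."""
    residues = three_letter_sequence.split("-")
    return "".join(THREE_TO_ONE[residue] for residue in residues)


def peptide_bond_count(one_letter_sequence):
    """A linear chain of n residues contains n - 1 peptide bonds."""
    return max(len(one_letter_sequence) - 1, 0)


def chain_class(residue_count):
    """Name a chain by length; the boundary at 20 residues is approximate."""
    if residue_count < 2:
        return "amino acid"
    if residue_count == 2:
        return "dipeptide"
    if residue_count == 3:
        return "tripeptide"
    if residue_count <= 20:
        return "oligopeptide"
    return "polypeptide"


sequence = to_one_letter("Gly-Arg-Phe-Ala-Lys")
print(sequence, peptide_bond_count(sequence), chain_class(len(sequence)))
```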

Some peptides are important biological compounds and the chemistry of peptides 
is an active area of research. Several hormones are peptides; for example, endorphins 
are the naturally occurring molecules that modulate pain in vertebrates. Some very sim- 
ple peptides are useful as food additives; for example, the sweetening agent aspartame is 
the methyl ester of aspartylphenylalanine (Figure 3.10). Aspartame is about 200 times 
sweeter than table sugar and is widely used in diet drinks. There are also many peptide 
toxins such as those found in snake venom and poisonous mushrooms. 

3.6 Protein Purification Techniques 

In order to study a particular protein in the laboratory it must be separated from all other 
cell components including other, similar proteins. Few analytical techniques will work 
with crude mixtures of cellular proteins because they contain hundreds (or thousands) of 
different proteins. The purification steps are different for each protein. They are worked 


out by trying a number of different techniques until a procedure is developed that repro- 
ducibly yields highly purified protein that is still biologically active. Purification steps usu- 
ally exploit minor differences in the solubilities, net charges, sizes, and binding specificities 
of proteins. In this section, we consider some of the common methods of protein purifica- 
tion. Most purification techniques are performed at 0°C to 4°C to minimize temperature- 
dependent processes such as protein degradation and denaturation (unfolding). 

The first step in protein purification is to prepare a solution of proteins. The source 
of a protein is often whole cells in which the target protein accounts for less than 0.1% 
of the total dry weight. Isolation of an intracellular protein requires that cells be sus- 
pended in a buffer solution and homogenized, or disrupted into cell fragments. Under 
these conditions most proteins dissolve. (Major exceptions include membrane proteins 
which require special purification procedures.) Let’s assume that the desired protein is 
one of many proteins in this solution. 

One of the first steps in protein purification is often a relatively crude separation 
that makes use of the different solubilities of proteins in salt solutions. Ammonium sul- 
fate is frequently used in such fractionations. Enough ammonium sulfate is mixed with 
the solution of proteins to precipitate the less soluble impurities, which are removed by 
centrifugation. The target protein and other more soluble proteins remain in the fluid 
called the supernatant fraction. Next, more ammonium sulfate is added to the super- 
natant fraction until the desired protein is precipitated. The mixture is centrifuged, the 
fluid removed, and the precipitate dissolved in a minimal volume of buffer solution. 
Typically, fractionation using ammonium sulfate gives a two- to threefold purification 
(i.e., one-half to two-thirds of the unwanted proteins have been removed from the re- 
sulting enriched protein fraction). At this point the solvent containing residual ammo- 
nium sulfate is exchanged by dialysis for a buffer solution suitable for chromatography. 
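
The link between "two- to threefold purification" and "one-half to two-thirds of the unwanted proteins removed" is simple arithmetic. The sketch below makes it explicit under two simplifying assumptions that are not stated in the text: all of the target protein is recovered, and the target is a negligible fraction of the total protein. The function name is illustrative.

```python
def fold_purification(fraction_unwanted_removed):
    """Enrichment of the target when a given fraction of unwanted protein is discarded.

    Assumes full recovery of the target and that the target is a negligible
    part of the total protein, so total protein shrinks by the same fraction.
    """
    if not 0 <= fraction_unwanted_removed < 1:
        raise ValueError("fraction must be at least 0 and less than 1")
    return 1.0 / (1.0 - fraction_unwanted_removed)


for fraction in (1 / 2, 2 / 3):
    print(f"{fraction:.2f} of unwanted protein removed: "
          f"{fold_purification(fraction):.1f}-fold purification")
```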

In dialysis, a protein solution is sealed in a cylinder of cellophane tubing and sus- 
pended in a large volume of buffer. The cellophane membrane is semipermeable — high 
molecular weight proteins are too large to pass through the pores of the membrane so 
proteins remain inside the tubing while low molecular weight solutes (including, in this 
case, ammonium and sulfate ions) diffuse out and are replaced by solutes in the buffer. 

Column chromatography is often used to separate a mixture of proteins. A cylindrical
column is filled with an insoluble material such as substituted cellulose fibers or
synthetic beads. The protein mixture is applied to the column and washed through the
matrix of insoluble material by the addition of solvent. As solvent flows through the
column the eluate (the liquid emerging from the bottom of the column) is collected in
many fractions, a few of which are represented in Figure 3.11a. The rate at which
proteins travel through the matrix depends on interactions between matrix and protein.
For a given column different proteins are eluted at different rates. The concentration of
protein in each fraction can be determined by measuring the absorbance of the eluate at
a wavelength of 280 nm (Figure 3.11b). (Recall from Section 3.2B that at neutral pH,
tyrosine and tryptophan absorb UV light at 280 nm.) To locate the target protein the
fractions containing protein must then be assayed, or tested, for biological activity or
some other characteristic property. Column chromatography may be performed under
high pressure using small, tightly packed columns with solvent flow controlled by a
computer. This technique is called HPLC, for high-performance liquid chromatography.

Chromatographic techniques are classified according to the type of matrix. In
ion-exchange chromatography the matrix carries positive charges (anion-exchange resins) or
negative charges (cation-exchange resins). Anion-exchange matrices bind negatively
charged proteins, retaining them in the matrix for subsequent elution. Conversely,
cation-exchange materials bind positively charged proteins. The bound proteins can be
serially eluted by gradually increasing the salt concentration in the solvent. As the salt
concentration is increased it eventually reaches a concentration where the salt ions
outcompete proteins in binding to the matrix. At this concentration the protein is released
and is collected in the eluate. Individual bound proteins are eluted at different salt
concentrations and this fractionation makes ion-exchange chromatography a powerful tool
in protein purification.
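
The choice between an anion-exchange and a cation-exchange matrix can be roughed out from a protein's isoelectric point and the buffer pH, using the same rule applied to alanine in Section 3.4: net positive below the pI, net negative above it. The sketch below is only a first approximation under that assumption. Real binding also depends on salt concentration and on how charge is distributed over the protein surface, and the pI values in the example are hypothetical.

```python
def net_charge_sign(pI, buffer_pH):
    """Sign of the net charge: +1 below the pI, -1 above it, 0 at the pI."""
    if buffer_pH < pI:
        return +1
    if buffer_pH > pI:
        return -1
    return 0


def binding_matrix(pI, buffer_pH):
    """Matrix type expected to retain the protein at this pH (first approximation)."""
    sign = net_charge_sign(pI, buffer_pH)
    if sign < 0:
        return "anion-exchange"   # positively charged matrix binds anionic proteins
    if sign > 0:
        return "cation-exchange"  # negatively charged matrix binds cationic proteins
    return "neither"


# Hypothetical proteins with made-up isoelectric points, in a pH 7.5 buffer
for name, pI in (("protein X", 5.2), ("protein Y", 9.1)):
    print(name, "->", binding_matrix(pI, buffer_pH=7.5))
```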

Gel-filtration chromatography separates proteins on the basis of molecular size. The 
gel is a matrix of porous beads. Proteins that are smaller than the average pore size 

There is only one correct way to write the
sequence of a polypeptide: from N-terminus
to C-terminus.

Green mamba (Dendroaspis angusticeps).

One of the toxins in the venom of this poisonous
snake is a large peptide with the
CCRSDKCNE [Viljoen and Botes (1974).
J. Biol. Chem. 249:366]


Figure 3.11

Column chromatography. (a) A mixture of proteins is added to a column containing a
solid matrix. Solvent then flows into the column from a reservoir. Washed by solvent,
different proteins (represented by red and blue bands) travel through the column at
different rates, depending on their interactions with the matrix. Eluate is collected in
a series of fractions, a few of which are shown. (b) The protein concentration of
each fraction is determined by measuring the absorbance at 280 nm. The peaks
correspond to the elution of the protein bands shown in (a). The fractions are then tested
for the presence of the target protein. (Axis label: fractions collected sequentially.)

A typical high-performance liquid chromatography
(HPLC) system in a research lab
(left). The large instrument on the right is a
mass spectrometer (Istituto di Ricerche
Farmacologiche, Milan, Italy).

penetrate much of the internal volume of the beads and are therefore retarded by the 
matrix as the buffer solution flows through the column. The smaller the protein, the 
later it elutes from the column. Fewer of the pores are accessible to larger protein mole- 
cules. Consequently, the largest proteins flow past the beads and elute first. 

Affinity chromatography is the most selective type of column chromatography. It re- 
lies on specific binding interactions between the target protein and some other mole- 
cule that is covalently bound to the matrix of the column. The molecule bound to the 
matrix may be a substance or a ligand that binds to a protein in vivo , an antibody that 
recognizes the target protein, or another protein that is known to interact with the tar- 
get protein inside the cell. As a mixture of proteins passes through the column only the 
target protein specifically binds to the matrix. The column is then washed with buffer 
several times to rid it of nonspecifically bound proteins. Finally, the target protein can 
be eluted by washing the column with a solvent containing a high concentration of salt 
that disrupts the interaction between the protein and column matrix. In some cases, 
bound protein can be selectively released from the affinity column by adding excess lig- 
and to the elution buffer. The target protein preferentially binds to the ligand in solu- 
tion instead of the lower concentration of ligand that is attached to the insoluble matrix 
of the column. This method is most effective when the ligand is a small molecule. Affin- 
ity chromatography alone can sometimes purify a protein 1000- to 10,000-fold. 

3.7 Analytical Techniques 

Electrophoresis separates proteins based on their migration in an electric field. In 
polyacrylamide gel electrophoresis (PAGE) protein samples are placed on a highly cross- 
linked gel matrix of polyacrylamide and an electric field is applied. The matrix is 


buffered to a mildly alkaline pH so that most proteins are anionic and migrate toward 
the anode. Typically, several samples are run at once together with a reference sample. 
The gel matrix retards the migration of large molecules as they move in the electric 
field. Hence, proteins are fractionated on the basis of both charge and mass. 

A modification of the standard electrophoresis technique uses the negatively 
charged detergent sodium dodecyl sulfate (SDS) to overwhelm the native charge on 
proteins so that they are separated on the basis of mass only. SDS-polyacrylamide gel 
electrophoresis (SDS-PAGE) is used to assess the purity and to estimate the molecular 
weight of a protein. In SDS-PAGE the detergent is added to the polyacrylamide gel as 
well as to the protein samples. A reducing agent is also added to the samples to reduce 
any disulfide bonds. The dodecyl sulfate anion, which has a long hydrophobic tail 
(CH₃(CH₂)₁₁OSO₃⁻, Figure 2.8), binds to hydrophobic side chains of amino acid
residues in the polypeptide chain. SDS binds at a ratio of approximately one molecule 
for every two residues of a typical protein. Since larger proteins bind proportionately 
more SDS the charge-to-mass ratios of all treated proteins are approximately the same. 
All the SDS-protein complexes are highly negatively charged and move toward the 
anode as diagrammed in Figure 3.12a. However, their rate of migration through the gel 
is inversely proportional to the logarithm of their mass — larger proteins encounter 
more resistance and therefore migrate more slowly than smaller proteins. This sieving 
effect differs from gel-filtration chromatography because in gel filtration larger mole- 
cules are excluded from the pores of the gel and hence travel faster. In SDS-PAGE all 
molecules penetrate the pores of the gel so the largest proteins travel most slowly. The 
protein bands that result from this differential migration (Figure 3.13) can be visualized 
by staining. Molecular weights of unknown proteins can be estimated by comparing 
their migration to the migration of reference proteins on the same gel. 
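
Estimating an unknown molecular weight from reference proteins amounts to building a calibration line and reading the unknown off it. The sketch below assumes, as is common practice, that log10(molecular weight) is approximately linear in the distance migrated over the useful range of the gel. The migration distances and molecular weights of the standards are invented for illustration and are not taken from Figure 3.13.

```python
import math


def fit_calibration(standards):
    """Least-squares line log10(MW) = intercept + slope * distance."""
    distances = [distance for distance, _ in standards]
    log_weights = [math.log10(weight) for _, weight in standards]
    n = len(standards)
    mean_d = sum(distances) / n
    mean_w = sum(log_weights) / n
    s_dd = sum((d - mean_d) ** 2 for d in distances)
    s_dw = sum((d - mean_d) * (w - mean_w) for d, w in zip(distances, log_weights))
    slope = s_dw / s_dd
    intercept = mean_w - slope * mean_d
    return slope, intercept


def estimate_molecular_weight(distance, slope, intercept):
    """Read the molecular weight of an unknown band off the calibration line."""
    return 10 ** (intercept + slope * distance)


# Hypothetical standards: (distance migrated in cm, molecular weight)
standards = [(1.0, 66000), (2.4, 29000), (3.1, 20000), (5.0, 6500)]
slope, intercept = fit_calibration(standards)

unknown_distance = 2.0  # cm, hypothetical band
print(round(estimate_molecular_weight(unknown_distance, slope, intercept)))
```

The negative slope of the fitted line reflects the behavior described above: larger proteins migrate a shorter distance in the same time.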

Although SDS-PAGE is primarily an analytical tool, it can be adapted for purifying 
proteins. Denatured proteins can be recovered from SDS-PAGE by cutting out the 
bands of a gel. The protein is then electroeluted by applying an electric current to allow 
the protein to migrate into a buffer solution. After concentration and the removal of 
salts such protein preparations can be used for structural analysis, preparation of anti- 
bodies, or other purposes. 



Figure 3.13

Proteins separated on an SDS-polyacrylamide gel. (a) Stained proteins after separation. The
high molecular weight proteins are at the top of the gel. (b) Graph showing the relationship
between the molecular weight of a protein and the distance it migrates in the gel. The
reference proteins labeled on the graph are bovine serum albumin, carbonic anhydrase,
soybean trypsin inhibitor, and aprotinin; the horizontal axis is distance migrated (cm).

Figure 3.12

SDS-PAGE. (a) An electrophoresis apparatus includes an SDS-polyacrylamide gel between
two glass plates and buffer in the upper and lower reservoirs. Samples are loaded into the
wells of the gel, and voltage is applied. Because proteins complexed with SDS are
negatively charged, they migrate toward the anode. (b) The banding pattern of the
proteins after electrophoresis can be visualized by staining. The smallest proteins migrate
fastest, so the proteins of lowest molecular weight are at the bottom of the gel.



Mass spectrometry, as the name implies, is a technique that determines the mass of a 
molecule. The most basic type of mass spectrometer measures the time that it takes for 
a charged gas phase molecule to travel from the point of injection to a sensitive detector. 
This time depends on the charge of a molecule and its mass and the result is reported as 
the mass/charge ratio. The technique has been used in chemistry for almost 100 years 
but its application to proteins was limited because, until recently, it was not possible to 
disperse charged protein molecules into a gaseous stream of particles. 

This problem was solved in the late 1980s with the development of two new types 
of mass spectrometry. In electrospray mass spectrometry the protein solution is pumped 
through a metal needle at high voltage to create tiny droplets. The liquid rapidly evapo- 
rates in a vacuum and the charged proteins are focused on a detector by a magnetic 
field. The second new technique is called matrix-assisted laser desorption ionization 
(MALDI). In this method the protein is mixed with a chemical matrix and the mixture is 
precipitated on a metal substrate. The matrix is a small organic molecule that absorbs 
light at a particular wavelength. A laser pulse at the absorption wavelength imparts en- 
ergy to the protein molecules via the matrix. The proteins are instantly released from 
the substrate (desorbed) and directed to the detector (Figure 3.14). When time-of-flight
(TOF) is measured, the technique is called MALDI-TOF.

Figure 3.14

MALDI-TOF mass spectrometry. (a) A burst of light releases proteins from the matrix.
(b) Charged proteins are directed toward the detector by an electric field. (c) The time of
arrival at the detector depends on the mass and the charge of the protein.






The raw data from a mass spectrometry experiment can be quite simple as shown 
in Figure 3.14. There, a single species with one positive charge is detected so the 
mass/charge ratio gives the mass directly. In other cases the spectra can be more com- 
plicated, especially in electrospray mass spectrometry. Often there are several different 
charged species and the correct mass has to be calculated by analyzing a collection of 
molecules with charges of +1, +2, +3, etc. The spectrum can be daunting when the 
source is a mixture of different proteins. Fortunately, there are sophisticated computer 
programs that can analyze the data and calculate the correct masses. The current popu- 
larity of mass spectrometry owes as much to the development of this software as it does 
to the new hardware and new methods of sample preparation. 
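
The charge-state calculation mentioned above can be sketched in a few lines. This is only a minimal illustration, not the algorithm used by any particular instrument software. It assumes positive-ion electrospray in which every charge comes from one added proton, and that the two peaks chosen are adjacent members of the same charge series. The function name and the example mass of 16951.5 are hypothetical.

```python
PROTON_MASS = 1.00728  # mass of a proton in daltons (standard value)


def mass_from_adjacent_peaks(mz_low, mz_high):
    """Estimate the charge and neutral mass from two adjacent peaks.

    Assumes the peak at mz_low carries z + 1 protons and the peak at
    mz_high carries z protons.
    """
    z = round((mz_low - PROTON_MASS) / (mz_high - mz_low))
    mass = z * (mz_high - PROTON_MASS)
    return z, mass


# Hypothetical protein of mass 16951.5 observed with 11 and with 10 charges.
true_mass = 16951.5
peak_11 = (true_mass + 11 * PROTON_MASS) / 11
peak_10 = (true_mass + 10 * PROTON_MASS) / 10

charge, mass = mass_from_adjacent_peaks(peak_11, peak_10)
print(charge, round(mass, 1))
```

Real spectra contain noise and overlapping series from several proteins, which is why dedicated software is needed in practice.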

Mass spectrometry is very sensitive and highly accurate. Often the mass of a protein 
can be obtained from picomole (10⁻¹² mol) quantities that are isolated from an 
SDS-PAGE gel. The correct mass can be determined with an accuracy of less than the 
mass of a single proton. 

3.8 Amino Acid Composition of Proteins 

Once a protein has been isolated its amino acid composition can be determined. First, 
the peptide bonds of the protein are cleaved by acid hydrolysis, typically using 6 M HCl 
(Figure 3.15). Next, the hydrolyzed mixture, or hydrolysate, is subjected to a chromato- 
graphic procedure in which each of the amino acids is separated and quantitated, a 
process called amino acid analysis. One method of amino acid analysis involves treat- 
ment of the protein hydrolysate with phenylisothiocyanate (PITC) at pH 9.0 to generate 
phenylthiocarbamoyl (PTC)-amino acid derivatives (Figure 3.16). The PTC-amino 
acid mixture is then subjected to HPLC in a column of fine silica beads to which short 
hydrocarbon chains have been attached. The amino acids are separated by the hy- 
drophobic properties of their side chains. As each PTC-amino acid derivative is eluted 
it is detected and its concentration is determined by measuring the absorbance of the 
eluate at 254 nm (the peak absorbance of the PTC moiety). Since different PTC-amino 
acid derivatives are eluted at different rates the time at which an amino acid derivative 
elutes from the column identifies the amino acid relative to known standards. The 
amount of each amino acid in the hydrolysate is proportional to the area under its peak. 
With this method, amino acid analysis can be performed on samples as small as 1 pico- 
mole of a protein that contains approximately 200 residues. 
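
Because the amount of each amino acid is proportional to the area under its peak, quantitation reduces to a ratio against a standard. The sketch below assumes that each standard was injected at a known amount and that the detector response is linear. The peak areas, the 100 pmol standard amount, and the function name are all hypothetical.

```python
def amounts_from_peak_areas(sample_areas, standard_areas, standard_pmol):
    """Convert peak areas to picomoles using standards run at a known amount."""
    return {
        amino_acid: area / standard_areas[amino_acid] * standard_pmol
        for amino_acid, area in sample_areas.items()
    }


# Hypothetical peak areas (arbitrary units); each standard was run at 100 pmol.
standard_areas = {"Ala": 2000.0, "Gly": 1800.0, "Leu": 2200.0}
sample_areas = {"Ala": 500.0, "Gly": 900.0, "Leu": 1100.0}

amounts = amounts_from_peak_areas(sample_areas, standard_areas, 100.0)
print(amounts)
```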

Despite its usefulness, acid hydrolysis cannot yield a complete amino acid analysis. 
Since the side chains of asparagine and glutamine contain amide bonds the acid used to 
cleave the peptide bonds of the protein also converts asparagine to aspartic acid and 
glutamine to glutamic acid. Other limitations of the acid hydrolysis method include 
small losses of serine, threonine, and tyrosine. In addition, the side chain of tryptophan 
is almost totally destroyed by acid hydrolysis. There are several ways of overcoming 
these limitations. For example, proteins can be hydrolyzed to amino acids by enzymes 

John B. Fenn (1917-) 

Koichi Tanaka (1959-) 

▲ John B. Fenn and Koichi Tanaka were 
awarded the Nobel Prize in Chemistry in 
2002 “for their development of soft 
desorption ionisation methods for mass 
spectrometric analyses of biological 
macromolecules.” 




▲ Figure 3.15 

Acid-catalyzed hydrolysis of a peptide. Incubation with 6 M HCl at 110°C for 16 to 72 hours releases 
the constituent amino acids of a peptide. 

▲ Figure 3.16 

Amino acid treated with phenylisothiocyanate 
(PITC). The α-amino group of an amino acid 
reacts with phenylisothiocyanate to give a 
phenylthiocarbamoyl-amino acid 
(PTC-amino acid). 

74 CHAPTER 3 Amino Acids and the Primary Structures of Proteins 

Figure 3.17 ► 

HPLC separation of amino acids. Amino acids 
obtained from the enzymatic hydrolysis of a 
protein are treated with o-phthalaldehyde 
and separated by HPLC. 

The frequency of amino acids in proteins 
is correlated with the number of 
codons for each amino acid (Section 22.1). 

Table 3.3 Amino acid compositions of proteins 

Amino acid          Frequency in proteins (%) 

Highly hydrophobic 
  Ile (I) 
  Val (V) 
  Leu (L) 
  Phe (F) 
  Met (M) 

Less hydrophobic 
  Ala (A) 
  Gly (G) 
  Cys (C) 
  Trp (W) 
  Pro (P) 
  Thr (T) 
  Ser (S) 

Highly hydrophilic 
  Asn (N) 
  Gln (Q) 
  Asp (D) 
  Glu (E) 
  His (H) 
  Lys (K) 


instead of using acid hydrolysis. The free amino acids are then attached to a chemical 
that absorbs light in the ultraviolet and the derivatized amino acids are analyzed by 
HPLC (Figure 3.17). 

Using various analytical techniques the complete amino acid compositions of 
many proteins have been determined. Dramatic differences in composition have been 
found, illustrating the tremendous potential for diversity based on different combina- 
tions of the 20 amino acids. 

The amino acid composition (and sequence) of a protein can also be determined 
from the sequence of its gene. In fact, these days it is often much easier to clone and 
sequence DNA than it is to purify and sequence a protein. Table 3.3 shows the average 
frequency of amino acid residues in more than 1000 different proteins whose sequences 
are deposited in protein databases. The most common amino acids are leucine, alanine, 
and glycine, followed by serine, valine, and glutamate. Tryptophan, cysteine, and 
histidine are the least abundant amino acids in typical proteins. 

If you know the amino acid composition of a protein, you can calculate its molecular 
weight using the molecular weights of the amino acids in Table 3.4. Be sure to subtract 
the molecular weight of one water molecule for each peptide bond (Section 3.5). You 
can get a rough estimate of the molecular weight of a protein by using the average 
molecular weight of a residue (= 110). Thus, a protein of 650 amino acid residues has an 
approximate relative molecular mass of 71,500 (Mr = 71,500). 
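
Both calculations in the preceding paragraph can be written out as a short sketch. The three molecular weights in the dictionary are rounded standard values for the free amino acids, supplied here as an assumption rather than copied from Table 3.4, and the function names are hypothetical. The composition is assumed to describe a single linear chain, so a chain of n residues has n - 1 peptide bonds.

```python
WATER_MR = 18.0  # approximate molecular weight of water

# Rounded molecular weights of three free amino acids (assumed standard values).
FREE_AMINO_ACID_MR = {"Gly": 75.0, "Ala": 89.0, "Ser": 105.0}


def rough_mr(n_residues, average_residue_mr=110):
    """Rough estimate: number of residues times the average residue weight."""
    return n_residues * average_residue_mr


def mr_from_composition(composition):
    """Sum the free amino acids, then subtract one water per peptide bond."""
    n_residues = sum(composition.values())
    total = sum(FREE_AMINO_ACID_MR[aa] * n for aa, n in composition.items())
    return total - WATER_MR * (n_residues - 1)


print(rough_mr(650))  # 71500
print(mr_from_composition({"Gly": 1, "Ala": 1, "Ser": 1}))  # 233.0
```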

3.9 Determining the Sequence of Amino Acid Residues 

Amino acid analysis provides information on the composition of a protein but not its 
primary structure (sequence of residues). In 1950, Pehr Edman developed a technique 
that permits removal and identification of one residue at a time from the N-terminus of 
a protein. The Edman degradation procedure involves treating a protein at pH 9.0 with 
PITC, also known as the Edman reagent. (Recall that PITC can also be used in the meas- 
urement of free amino acids as shown in Figure 3.16.) PITC reacts with the free N-termi- 
nus of the chain to form a phenylthiocarbamoyl derivative, or PTC-peptide (Figure 3.18, 
on the next page). When the PTC-peptide is treated with an anhydrous acid, such as 
trifluoroacetic acid, the peptide bond of the N-terminal residue is selectively cleaved, 
releasing an anilinothiazolinone derivative of the residue. This derivative can be extracted 
with an organic solvent, such as butyl chloride, leaving the remaining peptide in the 
aqueous phase. The unstable anilinothiazolinone derivative is then treated with aque- 
ous acid which converts it to a stable phenylthiohydantoin derivative of the amino acid 
that had been the N-terminal residue (PTH-amino acid). The polypeptide chain in the 
aqueous phase, now one residue shorter (residue 2 of the original protein is now the N- 
terminus), can be adjusted back to pH 9.0 and treated again with PITC. The entire pro- 
cedure can be repeated serially using an automated instrument known as a sequenator. 
Each cycle yields a PTH-amino acid that can be identified chromatographically, usually 
by HPLC. 


The yield of the Edman degradation procedure under carefully controlled conditions 
approaches 100%, and a few picomoles of sample protein can yield sequences of 
30 residues or more before further measurement is obscured by the increasing 
concentration of unrecovered sample from previous cycles of the procedure. For example, 
if the Edman degradation procedure had an efficiency of 98%, the cumulative yield at 
the 30th cycle would be 0.98³⁰, or 0.55. In other words, only about half of the 
PTH-amino acids generated in the 30th cycle would be derived from the 30th residue 
from the N-terminus. 
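
The cumulative-yield arithmetic is easy to check for other cycle numbers. The sketch below simply raises the per-cycle efficiency to the power of the cycle number. It assumes the efficiency is the same in every cycle, which is a simplification, and the function name is hypothetical.

```python
def cumulative_yield(efficiency, cycle):
    """Fraction of chains still in register after the given number of cycles."""
    return efficiency ** cycle


for cycle in (1, 10, 30, 50):
    print(cycle, round(cumulative_yield(0.98, cycle), 2))
```

At 98% efficiency the value for cycle 30 rounds to 0.55, the figure quoted in the text.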

[Figure 3.18, first step: phenylisothiocyanate (Edman reagent) reacts with the N-terminal residue of the polypeptide at pH 9.0.] 


Table 3.4 Molecular weights of amino acids 

Amino acid          Mr 









































[Figure 3.18, later steps: treatment with F3CCOOH releases the anilinothiazolinone derivative and leaves a polypeptide chain with n-1 amino acid residues. Aqueous acid converts the anilinothiazolinone to the phenylthiohydantoin derivative of the extracted N-terminal amino acid, and the amino acid is identified. The shortened chain is returned to alkaline conditions for reaction with additional phenylisothiocyanate in the next cycle of Edman degradation.] 

◄ Figure 3.18 

Edman degradation procedure. The N-terminal 
residue of a polypeptide chain reacts with 
phenylisothiocyanate to give a 
phenylthiocarbamoyl-peptide. Treating this derivative 
with trifluoroacetic acid (F3CCOOH) releases 
an anilinothiazolinone derivative of the 
N-terminal amino acid residue. The 
anilinothiazolinone is extracted and treated 
with aqueous acid, which rearranges the 
derivative to a stable phenylthiohydantoin 
derivative that can then be identified 
chromatographically. The remainder of the 
polypeptide chain, whose new N-terminal 
residue was formerly in the second position, 
is subjected to the next cycle of Edman 
degradation. 



▼ Figure 3.19 

Protein cleavage by cyanogen bromide (CNBr). 
Cyanogen bromide cleaves polypeptide 
chains at the C-terminal side of methionine 
residues. The reaction produces a peptidyl 
homoserine lactone and generates a new 
N-terminus. 

3.10 Protein Sequencing Strategies 

Most proteins contain too many residues to be completely sequenced by Edman degra- 
dation proceeding only from the N-terminus. Therefore, proteases (enzymes that cat- 
alyze the hydrolysis of peptide bonds in proteins) or certain chemical reagents are used 
to selectively cleave some of the peptide bonds of a protein. The smaller peptides formed 
are then isolated and subjected to sequencing by the Edman degradation procedure. 

The chemical reagent cyanogen bromide (CNBr) reacts specifically with methionine 
residues to produce peptides with C-terminal homoserine lactone residues and new 
N-terminal residues (Figure 3.19). Since most proteins contain relatively few methion- 
ine residues treatment with CNBr usually produces only a few peptide fragments. For 
example, reaction of CNBr with a polypeptide chain containing three internal methion- 
ine residues should generate four peptide fragments. Each fragment can then be se- 
quenced from its N-terminus. 

Many different proteases can be used to generate fragments for protein sequenc- 
ing. For example, trypsin specifically catalyzes the hydrolysis of peptide bonds on the 
carbonyl side of lysine and arginine residues both of which bear positively charged side 
chains (Figure 3.20a). Staphylococcus aureus V8 protease catalyzes the cleavage of pep- 
tide bonds on the carbonyl side of negatively charged residues (glutamate and aspar- 
tate); under appropriate conditions (50 mM ammonium bicarbonate), it cleaves only 
glutamyl bonds. Chymotrypsin, a less specific protease, preferentially catalyzes the hy- 
drolysis of peptide bonds on the carbonyl side of uncharged residues with aromatic or 
bulky hydrophobic side chains, such as phenylalanine, tyrosine, and tryptophan 
(Figure 3.20b). 
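
The cleavage rules described above can be tried out with a short sketch. It treats each reagent as a simple rule that cuts after every listed residue, using one-letter codes. This is a deliberate simplification: real proteases have additional preferences (for example, trypsin generally cuts poorly when the next residue is proline) that are not modeled here. The function name is hypothetical. The test peptides are the ones shown in Figures 3.19 and 3.20.

```python
def cleave(sequence, residues):
    """Split a sequence after every occurrence of the given residues.

    Cutting after a residue corresponds to cleavage on its carbonyl side.
    """
    fragments = []
    start = 0
    for i, amino_acid in enumerate(sequence):
        # No cut is made after the last residue of the chain.
        if amino_acid in residues and i < len(sequence) - 1:
            fragments.append(sequence[start:i + 1])
            start = i + 1
    fragments.append(sequence[start:])
    return fragments


oligopeptide = "GRASFGNKWEV"             # peptide from Figure 3.20
print(cleave(oligopeptide, "KR"))        # trypsin: ['GR', 'ASFGNK', 'WEV']
print(cleave(oligopeptide, "FYW"))       # chymotrypsin: ['GRASF', 'GNKW', 'EV']
print(cleave("GRFAKMWV", "M"))           # CNBr: ['GRFAKM', 'WV']
```

In the CNBr case the C-terminal methionine of the first fragment would actually be converted to homoserine lactone. The sketch keeps the letter M only to show where the cut falls.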

By judicious application of cyanogen bromide, trypsin, S. aureus V8 protease, and 
chymotrypsin to individual samples of a large protein one can generate many peptide 
fragments of various sizes. These fragments can then be separated and sequenced by 
Edman degradation. In the final stage of sequence determination the amino acid se- 
quence of a large polypeptide chain can be deduced by lining up matching sequences of 
overlapping peptide fragments as illustrated in Figure 3.20c. When referring to an 
amino acid residue whose position in the sequence is known it is customary to follow 
the residue abbreviation with its sequence number. For example, the third residue of the 
peptide shown in Figure 3.20 is called Ala-3. 
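
Lining up overlapping fragments can be illustrated with a brute-force sketch. It assumes that two complete, error-free digests of the same peptide are available. Every ordering of each fragment set is joined into a candidate sequence, and only a sequence that both digests can produce is kept. The function names are hypothetical, and the approach is practical only for a handful of fragments because the number of orderings grows factorially.

```python
from itertools import permutations


def candidate_sequences(fragments):
    """All sequences obtainable by joining the fragments in some order."""
    return {"".join(order) for order in permutations(fragments)}


def assemble(fragments_a, fragments_b):
    """Sequences consistent with both digests of the same peptide."""
    return candidate_sequences(fragments_a) & candidate_sequences(fragments_b)


# Fragments from Figure 3.20, deliberately listed out of order.
tryptic = ["WEV", "GR", "ASFGNK"]
chymotryptic = ["GNKW", "EV", "GRASF"]

print(assemble(tryptic, chymotryptic))  # {'GRASFGNKWEV'}
```

The single surviving sequence has alanine at position 3, in agreement with the Ala-3 example above.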

The process of generating and sequencing peptide fragments is especially impor- 
tant in obtaining information about the sequences of proteins whose N-termini are 
blocked. For example, the N-terminal α-amino groups of many bacterial proteins are 
formylated and do not react at all when subjected to the Edman degradation procedure. 
Peptide fragments with unblocked N-termini can be produced by selective cleavage and 
then separated and sequenced so that at least some of the internal sequence of the pro- 
tein can be obtained. 

For proteins that contain disulfide bonds, the complete covalent structure is not 
fully resolved until the positions of the disulfide bonds have been established. The posi- 
tions of the disulfide cross-links can be determined by fragmenting the intact protein, 
isolating the peptide fragments, and determining which fragments contain cystine 
residues. The task of determining the positions of the cross-links becomes quite compli- 
cated when the protein contains several disulfide bonds. 

H3N⁺-Gly-Arg-Phe-Ala-Lys-Met-Trp-Val-COO⁻ 

    BrCN (+ H2O) 

H3N⁺-Gly-Arg-Phe-Ala-Lys-(homoserine lactone) + H3N⁺-Trp-Val-COO⁻ + H3CSCN + H⁺ + Br⁻ 

Peptidyl homoserine lactone 


(a) H3N⁺-Gly-Arg-Ala-Ser-Phe-Gly-Asn-Lys-Trp-Glu-Val-COO⁻ 

    Trypsin 

    H3N⁺-Gly-Arg-COO⁻ + H3N⁺-Ala-Ser-Phe-Gly-Asn-Lys-COO⁻ + H3N⁺-Trp-Glu-Val-COO⁻ 

(b) H3N⁺-Gly-Arg-Ala-Ser-Phe-Gly-Asn-Lys-Trp-Glu-Val-COO⁻ 

    Chymotrypsin 

    H3N⁺-Gly-Arg-Ala-Ser-Phe-COO⁻ + H3N⁺-Gly-Asn-Lys-Trp-COO⁻ + H3N⁺-Glu-Val-COO⁻ 

(c) Gly-Arg 
            Ala-Ser-Phe-Gly-Asn-Lys 
                                    Trp-Glu-Val 
    Gly-Arg-Ala-Ser-Phe 
                        Gly-Asn-Lys-Trp 
                                        Glu-Val 

Deducing the amino acid sequence of a particular protein from the sequence of its 
gene (Figure 3.21) overcomes some of the technical limitations of direct analytical tech- 
niques. For example, the amount of tryptophan can be determined and aspartate and 
asparagine residues can be distinguished because they are encoded by different codons. 
However, direct sequencing of proteins is still important since it is the only way of de- 
termining whether modified amino acids are present or whether amino acid residues 
have been removed after protein synthesis is complete. 

Researchers frequently want to identify a particular unknown protein. Let’s say you 
have displayed human serum proteins on an SDS gel and you note the presence of a 
protein band at 67 kDa. What is that protein? Two recent developments have made the 
job of identifying unknown proteins much easier — sensitive mass spectrometry and 
genome sequences. Let’s see how they work. 

First, you isolate the protein by cutting out the unknown protein band and eluting 
the 67 kDa protein. The next step is to digest the protein with a protease that cuts at 
specific sites. Let’s say you choose trypsin, an enzyme that cleaves the peptide bond 
following arginine (R) or lysine (K) residues. After digestion with trypsin you end up with 
several dozen peptide fragments, all of which end with arginine or lysine. 

Next, you subject the peptide mixture to mass spectrometry choosing a method 
such as MALDI-TOF where the precise molecular weights of the peptides can be deter- 
mined. The resulting spectrum is shown in Figure 3.22. You now have a “fingerprint” of 
the unknown protein corresponding to the molecular weights of all the trypsin diges- 
tion products. 

In many labs the technique of chemical sequencing using Edman degradation has 
been replaced by methods using the mass spectrometer. If you wanted to determine the 
sequences of each peptide shown in Figure 3.22 your next step would be to fragment 
each peptide into various sized pieces and measure the precise molecular weight of each 
fragment in the mass spectrometer. 

The data can be used to determine the sequence of the peptide. For example, take 
the tryptic peptide of Mr = 1226.59 shown in Figure 3.22. One of the large pieces 
produced by fragmenting this peptide has a molecular weight of 1079.5. The difference 



[Figure 3.21: a DNA sequence and the peptide it encodes, Lys-Ser-Glu-Pro-Val] 

▲ Figure 3.20 

Cleavage and sequencing of an oligopeptide. 
(a) Trypsin catalyzes cleavage of peptides on 
the carbonyl side of the basic residues 
arginine and lysine. (b) Chymotrypsin catalyzes 
cleavage of peptides on the carbonyl side of 
uncharged residues with aromatic or bulky 
hydrophobic side chains, including 
phenylalanine, tyrosine, and tryptophan. (c) By 
using the Edman degradation procedure to 
determine the sequence of each fragment 
(highlighted in boxes) and then lining up the 
matching sequences of overlapping 
fragments, one can determine the order of the 
fragments and thus deduce the sequence of 
the entire oligopeptide. 

◄ Figure 3.21 

Sequences of DNA and protein. The amino acid 
sequence of a protein can be deduced from 
the sequence of nucleotides in the correspon- 
ding gene. A sequence of three nucleotides 
specifies one amino acid. A, C, G, and T rep- 
resent the nucleotide residues of DNA. 


[Spectrum: labeled peaks include 1657.74 (residues 118-130) and 1853.89 (residues 509-524); horizontal axis: Mr] 

▲ Figure 3.22 

Tryptic fingerprint of a 67 kDa serum protein. The number over each peak is the mass of the 
fragment. The numbers below each mass refer to the residues in Figure 3.23. (Adapted from 
Detlevuvkaw, Wikipedia entry on peptide mass fingerprinting.) 

▲ Frederick Sanger (1918-) Sanger won the 
Nobel Prize in Chemistry in 1958 for his work 
on sequencing proteins. He was awarded a 
second Nobel Prize in Chemistry in 1980 for 
developing methods of sequencing DNA. 

corresponds to a Phe (F) residue (1226.6 − 1079.5 = 147.1), meaning that Phe (F) is the 
residue at one end of the tryptic peptide. Another large fragment might have a molecular 
weight of 1098.5, and the difference (1226.6 − 1098.5 = 128.1) is the exact molecular weight 
of a Lys (K) residue. Thus, Lys (K) is the residue at the other end of the peptide. This has 
to be the C-terminal end since you know that trypsin cleaves after lysine or arginine 
residues. You can get the exact sequence of the peptide by analyzing the masses of all 
fragments in this manner. One of them will have a molecular weight of 258.0 and that is 
almost certainly the dipeptide Glu-Glu (EE). (The actual analysis is a bit more complicated 
than this but the principle is the same.) 
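
The mass-difference reasoning can be sketched as a lookup against a table of residue masses. The values below are standard monoisotopic residue masses (each is the free amino acid minus water). They are supplied here as an assumption, not taken from the text, and the function name and tolerance are hypothetical. Two limits of the approach show up immediately: leucine and isoleucine have identical masses, and lysine and glutamine differ by only about 0.04, so the closest match alone cannot always settle the identity.

```python
# Standard monoisotopic residue masses in daltons (amino acid minus water).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "I": 113.08406,
    "L": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}


def closest_residue(mass_difference, tolerance=0.1):
    """Return the residue whose mass is closest to the difference, or None."""
    residue, mass = min(
        RESIDUE_MASS.items(), key=lambda item: abs(item[1] - mass_difference)
    )
    if abs(mass - mass_difference) > tolerance:
        return None
    return residue


print(closest_residue(1226.6 - 1079.5))  # F
print(closest_residue(1226.6 - 1098.5))  # K
```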

But it’s often not necessary to do the second mass spectrometry analysis in order to 
identify an unknown protein. Since your unknown protein is from a species whose 
genome has been sequenced you can simply compare the tryptic fingerprint to the pre- 
dicted fingerprints of all the proteins encoded by all the genes in the genome. The data- 
base consists of a collection of hypothetical peptides produced by analyzing the amino 
acid sequence of each protein including proteins of unknown function that are known 
only from their sequence. In most cases your collection of peptide masses from the 
unknown protein will match only one protein from one of the genes in the database. 
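
The database comparison can be sketched as follows. The two database entries are invented toy sequences, not real proteins. The first was built to contain the peptide FKDLGEENFK discussed in this section. The observed masses are the three values quoted for Figure 3.22. The sketch makes several assumptions that are not stated in the text: the residue masses are standard monoisotopic values, each peak is treated as a singly protonated peptide, one missed cleavage is allowed (FKDLGEENFK contains an internal lysine), and the 0.05 tolerance is arbitrary. All function names are hypothetical.

```python
# Standard monoisotopic residue masses in daltons (amino acid minus water).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "I": 113.08406,
    "L": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056
PROTON = 1.00728


def tryptic_peptides(sequence, missed_cleavages=1):
    """Peptides from cutting after K or R, allowing a few missed cuts."""
    cuts = [0]
    cuts += [i + 1 for i, aa in enumerate(sequence[:-1]) if aa in "KR"]
    cuts.append(len(sequence))
    peptides = set()
    for i in range(len(cuts) - 1):
        for j in range(i + 1, min(i + 2 + missed_cleavages, len(cuts))):
            peptides.add(sequence[cuts[i]:cuts[j]])
    return peptides


def protonated_mass(peptide):
    """Mass of the peptide plus one proton."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER + PROTON


def count_matches(observed, sequence, tolerance=0.05):
    """Number of observed masses explained by the protein's predicted peptides."""
    predicted = [protonated_mass(p) for p in tryptic_peptides(sequence)]
    return sum(any(abs(m - p) <= tolerance for p in predicted) for m in observed)


# Toy database: invented sequences for illustration only.
database = {
    "toy_protein_A": "AHRFKDLGEENFKALVR",
    "toy_protein_B": "GASPVTKLLNDEQR",
}
observed = [1226.59, 1657.74, 1853.89]  # masses quoted for Figure 3.22

scores = {name: count_matches(observed, seq) for name, seq in database.items()}
best = max(scores, key=scores.get)
print(scores, best)
```

Only one of the three masses matches the toy entry, which would not be enough for a confident identification. A real search scores every protein predicted from the genome and relies on many matching peptides.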

In this case, the match is to human serum albumin, a well-known serum protein 
(Figure 3.23). The masses of several of the peptides correspond to the predicted masses 
of the peptides identified in red in the sequence. Take, for example, the peptide of Mr = 
1226.59 in the output from the tryptic fingerprint. This is exactly the predicted mass of 
the peptide from residues 35-44 (FKDLGEENFK). (Note that the first trypsin cleavage 
site follows the arginine residue at position 34 and the second cleavage site is after the 
lysine residue at position 44.) 

A single match is not sufficient to identify an unknown protein. In the example 
shown here there are 21 peptide fragments that match the amino acid sequence of 
human serum albumin and this is more than sufficient to uniquely identify the protein. 

In 1953, Frederick Sanger was the first scientist to determine the complete sequence 
of a protein (insulin). In 1958, he was awarded a Nobel Prize for this work. Twenty-two 
years later, Sanger won a second Nobel Prize for pioneering the sequencing of nucleic 
acids. Today we know the amino acid sequences of thousands of different proteins. 
These sequences not only reveal details of the structure of individual proteins but 
also allow researchers to identify families of related proteins and to predict the three- 
dimensional structure, and sometimes the function, of newly discovered proteins. 



























































































































▲ Figure 3.23 

The sequence of human serum albumin. Red residues highlight predicted tryptic peptides and the 
ones identified in the tryptic fingerprint (Figure 3.22) are underlined. 

3.11 Comparisons of the Primary Structures of 
Proteins Reveal Evolutionary Relationships 

In many cases workers have obtained sequences of the same protein from a number of dif- 
ferent species. The results show that closely related species contain proteins with very simi- 
lar amino acid sequences and that proteins from distantly related species are much less sim- 
ilar in sequence. The differences reflect evolutionary change from a common ancestral 
protein sequence. As more and more sequences were determined it soon became clear that 
one could construct a tree of similarities and this tree closely resembled the phylogenetic 
trees constructed from morphological comparisons and the fossil record. The evidence 
from molecular data was producing independent confirmation of the history of life. 

The first sequence-based trees were published almost 50 years ago. One of the earli- 
est examples was the tree for cytochrome c — a single polypeptide chain of approxi- 
mately 104 residues. It provides us with an excellent example of evolution at the molec- 
ular level. Cytochrome c is found in all aerobic organisms and the protein sequences 
from distantly related species, such as mammals and bacteria, are similar enough to 
confidently conclude that the proteins are homologous. (Different proteins and genes are 
defined as homologues if they have descended from a common ancestor. The evidence 
for homology is based on sequence similarity.) 

The first step in revealing evolutionary relationships is to align the amino acid se- 
quences of proteins from a number of species. Figure 3.24 shows an example of such an 
alignment for cytochrome c. The alignment reveals a remarkable conservation of 
residues at certain positions. For example, every sequence contains a proline at position 
30 and a methionine at position 80. In general, conserved residues contribute to the 
structural stability of the protein or are essential for its function. 
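
Finding invariant positions in an alignment is a column-by-column comparison. The sketch below uses three short invented sequences. They are not the cytochrome c data of Figure 3.24, and the species names and function name are hypothetical. Positions are numbered from 1, and a column containing a gap is not counted as conserved.

```python
def conserved_positions(alignment):
    """Map each fully conserved column (1-based) to its residue."""
    sequences = list(alignment.values())
    conserved = {}
    for column in range(len(sequences[0])):
        residues = {sequence[column] for sequence in sequences}
        if len(residues) == 1 and "-" not in residues:
            conserved[column + 1] = residues.pop()
    return conserved


# Invented aligned sequences of equal length, for illustration only.
alignment = {
    "species_A": "GDVEKGKKIF",
    "species_B": "GDVAKGKKTF",
    "species_C": "GNAEKGKKLF",
}

print(conserved_positions(alignment))
# {1: 'G', 5: 'K', 6: 'G', 7: 'K', 8: 'K', 10: 'F'}
```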

There is selection against any amino acid substitutions at these invariant posi- 
tions. A limited number of substitutions are observed at other sites. In most cases, the 
allowed substitutions are amino acid residues with similar properties. For example, 
position 20 can be occupied by leucine, isoleucine, or valine — these are all hydropho- 
bic residues. Similarly, many sites can be occupied by a number of different polar 
residues. Some positions are highly variable — residues at these sites contribute very 
little to the structure and function of the protein. The majority of observed amino 
acid substitutions in homologous proteins are neutral with respect to natural selection. 
The fixation of substitutions at such positions during evolution is due to random 
genetic drift and the phylogenetic tree represents proteins that have the same function 
even though they have different amino acid sequences. 

The function of cytochrome c is 
described in Section 14.7. 


Homology is a conclusion that is based 
on evidence such as sequence similarity. 
Homologous proteins descend from a 
common ancestor. There are degrees of 
sequence similarity (e.g., 75% identity), 
but homology is an all-or-nothing 
conclusion. Something is either 
homologous or it isn’t. 



Figure 3.24 ► 

Cytochrome c sequences. The sequences of cytochrome c proteins from various species are aligned 
to show their similarities. In some cases, gaps (signified by hyphens) have been introduced to im- 
prove the alignment. The gaps represent deletions and insertions in the genes that encode these 
proteins. For some species, additional residues at the ends of the sequence have been omitted. 
Hydrophobic residues are blue and polar residues are red. 

The cytochrome c sequences of humans and chimpanzees are identical. This is a re- 
flection of their close evolutionary relationship. The monkey and macaque sequences 
are very similar to the human and chimpanzee sequences as expected since all four 
species are primates. Similarly, the sequences of the plant cytochrome c molecules re- 
semble each other much more than they resemble any of the other sequences. 

Figure 3.25 illustrates the similarities between cytochrome c sequences in different 
species by depicting them as a tree whose branches are proportional in length to the 
number of differences in the amino acid sequences of the protein. Species that are closely 
related cluster together on the same branches of the tree because their proteins are very 
similar. At great evolutionary distances the number of differences may be very large. For 
example, the bacterial sequences differ substantially from the eukaryotic sequences 
reflecting divergence from a common ancestor that lived several billion years ago. The 
tree clearly reveals the three main kingdoms of eukaryotes — fungi, animals, and plants. 
(Protist sequences are not included in this tree in order to make it less complicated.) 
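
The branch lengths in such a tree start from counts of differences between pairs of aligned sequences. The sketch below computes those counts for three invented sequences. They are not cytochrome c data, and the function and species names are hypothetical. Positions where either sequence has a gap are skipped, which is one of several possible conventions. Building the tree itself requires a further clustering step that is not shown.

```python
from itertools import combinations


def count_differences(seq_a, seq_b):
    """Number of aligned positions where the residues differ, ignoring gaps."""
    return sum(
        1
        for a, b in zip(seq_a, seq_b)
        if a != b and a != "-" and b != "-"
    )


# Invented aligned sequences of equal length, for illustration only.
alignment = {
    "species_A": "GDVEKGKKIF",
    "species_B": "GDVAKGKKTF",
    "species_C": "GNAEKGKKLF",
}

differences = {
    (name_a, name_b): count_differences(alignment[name_a], alignment[name_b])
    for name_a, name_b in combinations(alignment, 2)
}
print(differences)
```

In this toy example species A and B differ at 2 positions, A and C at 3, and B and C at 4, so A and B would be placed closest together.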

Note that every species has changed since diverging from their common ancestor. 



► Figure 3.25 

Phylogenetic tree for cytochrome c. The 
length of the branches reflects the number 
of differences between the sequences of 
many cytochrome c proteins. [Adapted from 
Schwartz, R. M., and Dayhoff, M. O. 
(1978). Origins of prokaryotes, eukaryotes, 
mitochondria, and chloroplasts. Science 










[Figure 3.24 alignment: position markers 50 and 60; species labels include spider monkey, gray whale, king penguin, snapping turtle, bull frog, fruit fly, and mung bean] 















































































































































1. Proteins are made from 20 standard amino acids each of which 
contains an amino group, a carboxyl group, and a side chain, or 
R group. Except for glycine, which has no chiral carbon, all amino 
acids in proteins are of the L configuration. 

2. The side chains of amino acids can be classified according to their 
chemical structures — aliphatic, aromatic, sulfur containing, alco- 
hols, bases, acids, and amides. Some amino acids are further clas- 
sified as having highly hydrophobic or highly hydrophilic side 
chains. The properties of the side chains of amino acids are im- 
portant determinants of protein structure and function. 

3. Cells contain additional amino acids that are not used in protein 
synthesis. Some amino acids can be chemically modified to pro- 
duce compounds that act as hormones or neurotransmitters. Some 
amino acids are modified after incorporation into polypeptides. 

4. At pH 7, the α-carboxyl group of an amino acid is negatively 
charged (–COO⁻) and the α-amino group is positively charged 
(–NH3⁺). The charges of ionizable side chains depend on both 
the pH and their pKa values. 

5. Amino acid residues in proteins are linked by peptide bonds. The 
sequence of residues is called the primary structure of the protein. 

6. Proteins are purified by methods that take advantage of the differ- 
ences in solubility, net charge, size, and binding properties of in- 
dividual proteins. 

7. Analytical techniques such as SDS-PAGE and mass spectrometry 
reveal properties of proteins such as molecular weight. 

8. The amino acid composition of a protein can be determined 
quantitatively by hydrolyzing the peptide bonds and analyzing the 
hydrolysate chromatographically. 

9. The sequence of a polypeptide chain can be determined by the 
Edman degradation procedure in which the N-terminal residues 
are successively cleaved and identified. 

10. Proteins with very similar amino acid sequences are homolo- 
gous — they descend from a common ancestor. 

11. A comparison of sequences from different species reveals evolu- 
tionary relationships. 


1. Draw and label the stereochemical structure of L-cysteine. Indi- 
cate whether it is R or S by referring to Box 3.2 on page 61. 

2. Show that the Fischer projection of the common form of threonine 
(page 60) corresponds to 2S,3R-threonine. Draw and name 
the three other isomers of threonine. 

3. Histamine dihydrochloride is administered to melanoma (skin 
cancer) patients in combination with anticancer drugs because it 
makes the cancer cells more receptive to the drugs. Draw the 
chemical structure of histamine dihydrochloride. 

4. Dried fish treated with salt and nitrite has been found to contain 
the mutagen 2-chloro-4-methylthiobutanoic acid (CMBA). From 
what amino acid is CMBA derived? 


H3C-S-CH2-CH2-CHCl-COOH 





5. For each of the following modified amino acid side chains, iden- 
tify the amino acid from which it was derived and the type of 
chemical modification that has occurred. 

(a) –CH2OPO3²⁻ 

(b) –CH2CH(COO⁻)2 

(c) –(CH2)4–NH–C(O)CH3 

6. The tripeptide glutathione (GSH) (γ-Glu-Cys-Gly) serves a protective 
function in animals by destroying toxic peroxides that are 
generated during aerobic metabolic processes. Draw the chemical 
structure of glutathione. Note: The γ symbol indicates that the 
peptide bond between Glu and Cys is formed between the 
γ-carboxyl of Glu and the amino group of Cys. 

7. Melittin is a 26-residue polypeptide found in bee venom. In its 
monomeric form, melittin is thought to insert into lipid-rich 
membrane structures. Explain how the amino acid sequence of 
melittin accounts for this property. 

H3N⁺-Gly-Ile-Gly-Ala-Val-Leu-Lys-Val-Leu-Thr-Gly-Leu- 
Pro-Ala-Leu-Ile-Ser-Trp-Ile-Lys-Arg-Lys-Arg-Gln-Gln-NH2 


8. Calculate the isoelectric points of (a) arginine and (b) glutamate. 

9. Oxytocin is a nonapeptide (a nine-residue peptide) hormone involved 
in the milk-releasing response in lactating mammals. The 
sequence of a synthetic version of oxytocin is shown below. What 
is the net charge of this peptide at (a) pH 2.0, (b) pH 8.5, and 
(c) pH 10.7? Assume that the ionizable groups have the pKa values 
listed in Table 3.2. The disulfide bond is stable at pH 2.0, pH 
8.5, and pH 10.7. Note that the C-terminus is amidated. 

Cys-Phe-Ile-Glu-Asn-Cys-Pro-His-Gly-NH2 

10. Draw the following structures for compounds that would occur
during the Edman degradation procedure: (a) PTC-Leu-Ala,
(b) PTH-Ser, (c) PTH-Pro.

11. Predict the fragments that will be generated from the treatment
of the following peptide with (a) trypsin, (b) chymotrypsin, and
(c) S. aureus V8 protease.



12. The titration curve for histidine is shown below. The pKₐ values
are 1.8 (—COOH), 6.0 (side chain), and 9.3 (—NH₃⁺).

(a) Draw the structure of histidine at each stage of ionization.

(b) Identify the points on the titration curve that correspond to
the four ionic species.

(c) Identify the points at which the average net charge is +2, +0.5,
and −1.

(d) Identify the point at which the pH equals the pKₐ of the side chain.

(e) Identify the point that indicates complete titration of the side chain.

(f) In what pH ranges would histidine be a good buffer?

13. You have isolated a decapeptide (a 10-residue peptide) called FP,
which has anticancer activity. Determine the sequence of the peptide
from the following information. (Note that amino acids are
separated by commas when their sequence is not known.)

(a) One cycle of Edman degradation of intact FP yields 2 mol of
PTH-aspartate per mole of FP.

(b) Treatment of a solution of FP with 2-mercaptoethanol fol- 
lowed by the addition of trypsin yields three peptides with 
the composition (Ala, Cys, Phe), (Arg, Asp), and (Asp, Cys, 
Gly, Met, Phe). The intact (Ala, Cys, Phe) peptide yields 
PTH-cysteine in the first cycle of Edman degradation. 

(c) Treatment of 1 mol of FP with carboxypeptidase (which 
cleaves the C-terminal residue from peptides) yields 2 mol of 

(d) Treatment of the intact pentapeptide (Asp, Cys, Gly, Met,
Phe) with CNBr yields two peptides with the composition
(homoserine lactone, Asp) and (Cys, Gly, Phe). The (Cys, Gly,
Phe) peptide yields PTH-glycine in the first cycle of Edman
degradation.

14. Portions of the amino acid sequences for cytochrome c from the
alligator and bullfrog are given (from Figure 3.24).

Amino acids 31-50

(a) Give an example of a substitution involving similar amino acids.

(b) Give an example of a more radical substitution.

15. Several common amino acids are modified to produce biologically
important amines. Serotonin is a biologically important
neurotransmitter synthesized in the brain. Low levels of serotonin 
in the brain have been linked to conditions such as depression, 
aggression, and hyperactivity. From what amino acid is serotonin 
derived? Identify the differences in structure between the amino 
acid and serotonin. 


16. The structure of thyrotropin-releasing hormone (TRH) is shown
below. TRH is a peptide hormone originally isolated from
extracts of the hypothalamus.

(a) How many peptide bonds are present in TRH? 

(b) From what tripeptide is TRH derived? 

(c) What result do the modifications have on the charges of the 
amino and carboxyl-terminal groups? 

[Figure: chemical structure of thyrotropin-releasing hormone (TRH)]


17. Chirality plays a major role in the development of new pharmaceuticals.
People with Parkinson's disease have depleted amounts
of dopamine in their brains. In an effort to increase the amount
of dopamine, patients are given the drug L-dopa, which is
converted to dopamine in the brain. L-Dopa is marketed in an
enantiomerically pure form. (a) Give the RS designation for
L-dopa. (b) From which amino acid are both L-dopa and dopamine
derived?

84 CHAPTER 3 Amino Acids and the Primary Structures of Proteins 

18. Generations of biochemistry students have encountered a ques- 
tion like the one below on their final exam. 

Calculate the approximate concentration of the uncharged form 
of alanine (see below) in a 0.01 M solution of alanine at (a) pH 2.4 
(b) pH 6.15 and (c) pH 9.9. 

H₂N—CH(CH₃)—COOH

Can you answer the question without peeking at the solution? 

19. A solution of 0.01 M alanine is adjusted to pH 2.4 by adding
NaOH. What is the concentration of the zwitterion in this solu- 
tion? What would it be if the pH was 4.0? 


Proteins: Three-Dimensional 
Structure and Function 

We saw in the previous chapter that a protein can be described as a chain of
amino acids joined by peptide bonds in a specific sequence. However,
polypeptide chains are not simply linear but are also folded into compact
shapes that contain coils, zigzags, turns, and loops. Over the last 50 years the
three-dimensional shapes, or conformations, of thousands of proteins have been determined. A
conformation is a spatial arrangement of atoms that depends on the rotation of a bond or
bonds. The conformation of a molecule, such as a protein, can change without breaking
covalent bonds, whereas the various configurations of a molecule can be changed only by
breaking and re-forming covalent bonds. (Recall that the L and D forms of amino acids
represent different configurations.) Each protein has an astronomical number of potential
conformations because every amino acid residue has a number of possible conformations
and there are many residues in a protein. Nevertheless, under physiological
conditions most proteins fold into a single stable shape known as the native conformation.
A number of factors constrain rotation around the covalent bonds in a polypeptide
chain in its native conformation. These include the presence of hydrogen bonds
and other weak interactions between amino acid residues. The biological function of a
protein depends on its native three-dimensional conformation.

A protein may be a single polypeptide chain or it may be composed of several
polypeptide chains bound to each other by weak interactions. As a general rule, each
polypeptide chain is encoded by a single gene, although there are some interesting
exceptions to this rule. The size of genes and the polypeptides they encode can vary by
more than an order of magnitude. Some polypeptides contain only 100 amino acid
residues, with a relative molecular mass of about 11,000 (Mᵣ = 11,000). (Recall that the
average relative molecular mass of an amino acid residue of a protein is 110.) On the
other hand, some very large polypeptide chains contain more than 2000 amino acid
residues (Mᵣ = 220,000).
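
The arithmetic behind these figures can be sketched in a few lines of Python. This is only an illustration of the rule of thumb quoted above (about 110 per residue); the function name and constant are hypothetical and chosen for this example, and the result is an estimate rather than an exact mass.

```python
# Average relative molecular mass of one amino acid residue, as stated in the text.
AVERAGE_RESIDUE_MASS = 110


def estimate_relative_molecular_mass(residue_count: int) -> int:
    """Estimate the relative molecular mass of a polypeptide from its length."""
    if residue_count < 0:
        raise ValueError("residue_count must not be negative")
    return residue_count * AVERAGE_RESIDUE_MASS


if __name__ == "__main__":
    # Lengths taken from the examples in the text: a small and a very large polypeptide.
    for length in (100, 2000):
        print(length, "residues ->", estimate_relative_molecular_mass(length))
```

Real polypeptides deviate from this estimate because their amino acid compositions differ, so the value is best treated as a rough guide.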

From the intensity of the spots near the centre, we can infer that the
protein molecules are relatively dense globular bodies, perhaps joined
together by valency bridges, but in any event separated by relatively large
spaces which contain water. From the intensity of the more distant spots, it
can be inferred that the arrangement of atoms inside the protein molecule is
also of a perfectly definite kind, although without the periodicities
characterising the fibrous proteins. The observations are compatible with oblate
spheroidal molecules of diameters about 25 A. and 35 A., arranged in
hexagonal screw-axis. ... At this stage, such ideas are merely speculative, but now
that a crystalline protein has been made to give X-ray photographs, it is
clear that we have the means of checking them and, by examining the
structure of all crystalline proteins, arriving at a far more detailed conclusion about
protein structure than previous physical or chemical methods have been
able to give.

— Dorothy Crowfoot Hodgkin (1934)

Top: Bighorn sheep. The skin, wool, and horns are composed largely of fibrous proteins. 


86 CHAPTER 4 Proteins: Three-Dimensional Structure and Function 

Classes of proteins are described in 
the introduction to Chapter 3, and the 
various classes of enzymes are 
described in Section 5.1. 

The terms globular proteins and fibrous 
proteins are rarely used in modern sci- 
entific publications. There are many 
proteins that don’t fit into either category. 

In some species, the size and sequence of every polypeptide can be determined 
from the sequence of the genome. There are about 4000 different polypeptides in the 
bacterium Escherichia coli with an average size of about 300 amino acid residues 
(Mᵣ = 33,000). The fruit fly Drosophila melanogaster contains about 14,000 different
polypeptides with an average size about the same as that in bacteria. Humans and other 
mammals have about 20,000 different polypeptides. The study of large sets of proteins, 
such as the entire complement of proteins produced by a cell, is part of a field of study 
called proteomics. 

Proteins come in a variety of shapes. Many are water-soluble, compact, roughly 
spherical macromolecules whose polypeptide chains are tightly folded. Such proteins — 
traditionally called globular proteins — characteristically have a hydrophobic interior and 
a hydrophilic surface. They possess indentations or clefts that specifically recognize and 
transiently bind other compounds. By selectively binding other molecules these pro- 
teins serve as dynamic agents of biological action. Many globular proteins are 
enzymes — the biochemical catalysts of cells. About 31% of the polypeptides in E. coli are 
classical metabolic enzymes such as those described in the next few chapters. Other pro- 
teins include various factors, carrier proteins, and regulatory proteins; 12% of the 
known proteins in E. coli fall into these categories. 

Polypeptides can also be components of large subcellular or extracellular structures 
such as ribosomes, flagella and cilia, muscle, and chromatin. Fibrous proteins are a partic- 
ular class of structural proteins that provide mechanical support to cells or organisms. 
Fibrous proteins are typically assembled into large cables or threads. Examples of 
fibrous proteins are α-keratin, the major component of hair and nails, and collagen, the
major protein component of tendons, skin, bones, and teeth. Other examples of structural 
proteins include the protein components of viruses, bacteriophages, spores, and pollen. 

► Escherichia coli proteins. Proteins from 
E. coli cells are separated by two-dimensional 
gel electrophoresis. In the first dimension, 
the proteins are separated by a pH gradient 
where each protein migrates to its isoelec- 
tric point. The second dimension separates 
proteins by size on an SDS-polyacrylamide 
gel. Each spot corresponds to a single 
polypeptide. There are about 4000 different 
proteins in E. coli, but some of them are 
present in very small quantities and can’t be 
seen on this 2-D gel. This figure is from the 
Swiss-2D PAGE database. You can visit this 
site and click on any one of the spots to find 
out more about a particular protein. 


Many proteins are either integral components of membranes or membrane-associated 
proteins. Membrane proteins account for at least 16% of the polypeptides in E. coli and 
a much higher percentage in eukaryotic cells. 

This chapter describes the molecular architecture of proteins. We will explore the 
conformation of the peptide bond and see that two simple shapes, the α helix and the
β sheet, are common structural elements in all classes of proteins. We will describe
higher levels of protein structure and discuss protein folding and stabilization. Finally, 
we will examine how protein structure is related to function using collagen, hemoglo- 
bin, and antibodies as examples. Above all, we will learn that proteins have properties 
beyond those of free amino acids. Chapters 5 and 6 describe the role of proteins as en- 
zymes. The structures of membrane proteins are examined in more detail in Chapter 9 
and proteins that bind nucleic acids are covered in Chapters 20 to 22. 

4.1 There Are Four Levels of Protein Structure 

Individual protein molecules have up to four levels of structure (Figure 4.1). As noted in 
Chapter 3, primary structure describes the linear sequence of amino acid residues in a 
protein. The three-dimensional structure of a protein is described by three additional 
levels: secondary structure, tertiary structure, and quaternary structure. The forces re- 
sponsible for maintaining, or stabilizing, these three levels are primarily noncovalent. 

Secondary structure refers to regularities in local conformations maintained by
hydrogen bonds between amide hydrogens and carbonyl oxygens of the peptide
backbone. The major secondary structures are α helices, β strands, and turns. Cartoons
showing the structures of folded proteins usually represent α-helical regions by helices
and β strands by broad arrows pointing in the N-terminal to C-terminal direction.

Tertiary structure describes the completely folded and compacted polypeptide chain. 
Many folded polypeptides consist of several distinct globular units linked by a short 
stretch of amino acid residues as shown in Figure 4.1c. Such units are called domains. 
Tertiary structures are stabilized by the interactions of amino acid side chains in non- 
neighboring regions of the polypeptide chain. The formation of tertiary structure 
brings distant portions of the primary and secondary structures close together. 

◄ Figure 4.1

Levels of protein structure. (a) The linear
sequence of amino acid residues defines the
primary structure. (b) Secondary structure
consists of regions of regularly repeating
conformations of the peptide chain such as
α helices and β sheets. (c) Tertiary structure
describes the shape of the fully folded
polypeptide chain. The example shown has
two domains. (d) Quaternary structure refers
to the arrangement of two or more polypeptide
chains into a multisubunit molecule.



Some proteins possess quaternary structure — the association of two or more 
polypeptide chains into a multisubunit, or oligomeric, protein. The polypeptide chains 
of an oligomeric protein may be identical or different. 

4.2 Methods for Determining Protein Structure 

As we saw in Chapter 3, the amino acid sequence of polypeptides (i.e., primary struc- 
ture) can be determined directly by sequencing the protein or indirectly by sequencing 
the gene. The usual technique for determining the three-dimensional conformation of a 
protein is X-ray crystallography. In this technique, a beam of collimated (parallel) 
X rays is aimed at a crystal of protein molecules. Electrons in the crystal diffract the 
X rays that are then recorded on film or by an electronic detector (Figure 4.2). Mathe- 
matical analysis of the diffraction pattern produces an image of the electron clouds sur- 
rounding atoms in the crystal. This electron density map reveals the overall shape of the 
molecule and the positions of each of the atoms in three-dimensional space. By com- 
bining these data with the principles of chemical bonding it is possible to deduce the lo- 
cation of all the bonds in a molecule and hence its overall structure. The technique of 
X-ray crystallography has developed to the point where it is possible to determine the 
structure of a protein without precise knowledge of the amino acid sequence. In prac- 
tice, knowledge of the primary structure makes fitting of the electron density map 
much easier at the stage where chemical bonds between atoms are determined. 

Initially, X-ray crystallography was used to study the simple repeating units of fibrous 
proteins and the structures of small biological molecules. Dorothy Crowfoot Hodgkin was 
one of the early pioneers in the application of X-ray crystallography to biological mole- 
cules. She solved the structure of penicillin in 1947 and developed many of the techniques 
used in the study of large proteins. Hodgkin received the Nobel Prize in 1964 for
determining the structure of vitamin B₁₂ and she later published the structure of insulin.

The chief impediment to determining the three-dimensional structure of an entire 
protein was the difficulty of calculating atomic positions from the positions and inten- 
sities of diffracted X-ray beams. Not surprisingly, the development of X-ray crystallog- 
raphy of macromolecules closely followed the development of computers. By 1962, 
John C. Kendrew and Max Perutz had elucidated the structures of the proteins myo- 
globin and hemoglobin, respectively, using large and very expensive computers at 
Cambridge University in the United Kingdom. Their results provided the first insights 
into the nature of the tertiary structures of proteins and earned them a Nobel Prize in 
1962. Since then, the structures of many proteins have been revealed by X-ray crystal- 
lography. In recent years, there have been significant advances in the technology due to 
the availability of inexpensive high-speed computers and improvements in producing 
focused beams of X rays. The determination of protein structures is now limited mainly
by the difficulty of preparing crystals of a quality suitable for X-ray diffraction, and even
that step is mostly carried out by computer-driven robots.

Figure 4.2 ►

X-ray crystallography. (a) Diagram of X rays
diffracted by a protein crystal. (b) X-ray
diffraction pattern of a crystal of adult human
deoxyhemoglobin. The location and intensity
of the spots are used to determine the
three-dimensional structure of the protein.

◄ Bioinformatics in the 1950s. Bror Strandberg
(left) and Dick Dickerson (right) carrying
computer tapes from the EDSAC II
computer center in Cambridge, UK. The
tapes contain X-ray diffraction data from
crystals of myoglobin.

A protein crystal contains a large number of water molecules and it is often possi- 
ble to diffuse small ligands such as substrate or inhibitor molecules into the crystal. In 
many cases, the proteins within the crystal retain their ability to bind these ligands and 
they often exhibit catalytic activity. The catalytic activity of enzymes in the crystalline 
state demonstrates that the proteins crystallize in their in vivo native conformations. 
Thus, the protein structures solved by X-ray crystallography are accurate representa- 
tions of the structures that exist inside cells. 

Once the three-dimensional coordinates of the atoms of a macromolecule have 
been determined, they are deposited in a data bank where they are available to other 
scientists. Biochemists were among the early pioneers in exploiting the Internet to 
share data with researchers around the world — the first public domain databases of 
biomolecular structures and sequences were established in the late 1970s. Many of the 
images in this text were created using data files from the Protein Data Bank (PDB). 

Visit the website for information on how 
to view three-dimensional structures 
and retrieve data files. 

◄ Max Perutz (1914-2002) (left) and John 
C. Kendrew (1917-1997) (right). Kendrew 
determined the structure of myoglobin and 
Perutz determined the structure of hemoglo- 
bin. They shared the Nobel Prize in 1962. 



▲ Figure 4.3

Bovine (Bos taurus) ribonuclease A. Ribonuclease
A is a secreted enzyme that hydrolyzes
RNA during digestion. (a) Space-filling model
showing a bound substrate analog in black.
(b) Cartoon ribbon model of the polypeptide
chain showing secondary structure. (c) View
of the substrate-binding site. The substrate
analog (5'-diphosphoadenine-3'-phosphate)
is depicted as a space-filling model, and the
side chains of amino acid residues are shown
as ball-and-stick models. [PDB 1AFK]

Figure 4.4 ►

Bovine ribonuclease A NMR structure. The
figure combines a set of very similar structures
that satisfy the data on atomic interactions.
Only the backbone of the polypeptide
chain is shown. Compare this structure with
that in Figure 4.3b. Note the presence of
disulfide bridges (yellow), which are not
shown in the images derived from the X-ray
crystal structure. [PDB 2AAS]

We will list the PDB filename, or accession number, for every protein structure shown in 
this text so that you can view the three-dimensional structure on your own computer. 

There are many ways of depicting the three-dimensional structure of proteins. 
Space-filling models (Figure 4.3a) depict each atom as a solid sphere. Such images re- 
veal the dense, closely packed nature of folded polypeptide chains. Space-filling models 
of structures are used to illustrate the overall shape of a protein and the surface exposed 
to aqueous solvent. One can easily appreciate that the interior of folded proteins is 
nearly impenetrable, even by small molecules such as water. 

The structure of a protein can also be depicted as a simplified cartoon that empha- 
sizes the backbone of the polypeptide chain (Figure 4.3b). In these models, the amino 
acid side chains have been eliminated, making it easier to see how the polypeptide folds 
into a three-dimensional shape. Such models have the advantage of allowing us to see 
into the interior of the protein, and they also reveal elements of secondary structure such 
as α helices and β strands. By comparing the structures of different proteins, it is possible
to recognize common folds and patterns that can’t be seen in space-filling models.

The most detailed models are those that emphasize the structures of the amino 
acid side chains and the various covalent bonds and weak interactions between atoms 
(Figure 4.3c). Such detailed models are especially important in understanding how a 
substrate binds in the active site of an enzyme. In Figure 4.3c, the backbone is shown in 
the same orientation as in Figure 4.3b. 

Another technique for analyzing the macromolecular structure of proteins is nu- 
clear magnetic resonance (NMR) spectroscopy. This method permits the study of pro- 
teins in solution and therefore does not require the painstaking preparation of crystals. 
In NMR spectroscopy, a sample of protein is placed in a magnetic field. Certain atomic 
nuclei absorb electromagnetic radiation as the applied magnetic field is varied. Because 
absorbance is influenced by neighboring atoms, interactions between atoms that are 
close together can be recorded. By combining these results with the amino acid se- 
quence and known structural constraints it is possible to calculate a number of struc- 
tures that satisfy the observed interactions. 

Figure 4.4 depicts the complete set of structures for bovine ribonuclease A — the 
same protein whose X-ray crystal structure is shown in Figure 4.3. Note that the possible 
structures are very similar and the overall shape of the molecule is easily seen. In some 
cases, the set of NMR structures may represent fluctuations, or “breathing,” of the
protein in solution. The similarity of the NMR and X-ray crystal structures indicates that the
protein structures found in crystals accurately represent the structure of the protein in
solution. In some cases, however, the structures do not agree. Often this is due to disordered
regions that do not show up in the X-ray crystal structure (Section 4.7D). On very rare
occasions the protein crystallizes in a conformation that is not the true native form; in
such cases the NMR structure is thought to be more accurate.

In general, the NMR spectra for small proteins such as ribonuclease A can be easily
solved but the spectrum of a large molecule can be extremely complex. For this reason, it
is very difficult to determine the structure of larger proteins by NMR, although the
technique is very powerful for smaller proteins.


4.3 The Conformation of the Peptide Group 

Our detailed study of protein structure begins with the structure of the peptide bonds
that link amino acids in a polypeptide chain. The two atoms involved in the peptide
bond, along with their four substituents (the carbonyl oxygen atom, the amide hydrogen
atom, and the two adjacent α-carbon atoms), constitute the peptide group. X-ray
crystallographic analyses of small peptides reveal that the bond between the carbonyl
carbon and the nitrogen is shorter than typical C—N single bonds but longer than typical
C=N double bonds. In addition, the bond between the carbonyl carbon and the
oxygen is slightly longer than typical C=O double bonds. These measurements reveal
that peptide bonds have some double-bond properties and can best be represented as a
resonance hybrid (Figure 4.5).

Note that the peptide group is polar. The carbonyl oxygen has a partial negative
charge and can serve as a hydrogen acceptor in hydrogen bonds. The nitrogen has a partial
positive charge, and the —NH group can serve as a hydrogen donor in hydrogen
bonds. Electron delocalization and the partial double-bond character of the peptide
bond prevent free rotation around the C—N bond. As a result, the atoms
of the peptide group lie in the same plane (Figure 4.6). Rotation is still possible around
each N—Cα bond and each Cα—C bond in the repeating N—Cα—C backbone of
proteins. As we will see, restrictions on free rotation around these two additional bonds
ultimately determine the three-dimensional conformation of a protein.

Because of the double-bond nature of the peptide bond, the conformation of the
peptide group is restricted to one of two possible conformations, either trans or cis
(Figure 4.7). In the trans conformation, the two α-carbons of adjacent amino acid
residues are on opposite sides of the peptide bond and at opposite corners of the rectangle
formed by the planar peptide group. In the cis conformation, the two α-carbons are
on the same side of the peptide bond and are closer together. The cis and trans conformations
arise during protein synthesis when the peptide bond is formed by joining
amino acids to the growing polypeptide chain. The two conformations are not easily
interconverted by free rotation around the peptide bond once it has formed.

The cis conformation is less favorable than the extended trans conformation because
of steric interference between the side chains attached to the two α-carbon atoms.
Consequently, nearly all peptide groups in proteins are in the trans conformation. Rare
exceptions occur, usually at bonds involving the amide nitrogen of proline. Because of
the unusual ring structure of proline, the cis conformation creates only slightly more
steric interference than the trans conformation.

Remember that even though the atoms of the peptide group lie in a plane, rotation is
still possible about the N—Cα and Cα—C bonds in the repeating N—Cα—C backbone.
This rotation is restricted by steric interference between main-chain and side-chain atoms
of adjacent residues. One of the most important restrictions on free rotation is steric
interference between carbonyl oxygens on adjacent amino acid residues in the polypeptide








▲ Figure 4.5

Resonance structure of the peptide bond.
(a) In this resonance form, the peptide bond
is shown as a single C—N bond. (b) In this
resonance form, the peptide bond is shown
as a double bond. (c) The actual structure is
best represented as a hybrid of the two resonance
forms in which electrons are delocalized
over the carbonyl oxygen, the carbonyl
carbon, and the amide nitrogen. Rotation
around the C—N bond is restricted due to
the double-bond nature of the resonance
hybrid form.



▲ Figure 4.6

Planar peptide groups in a polypeptide chain.
A peptide group consists of the N—H and
C=O groups involved in formation of the
peptide bond, as well as the α-carbons on
each side of the peptide bond. Two peptide
groups are highlighted in this diagram.

◄ Figure 4.7

Trans and cis conformations of a peptide group.
Nearly all peptide groups in proteins are in
the trans conformation, which minimizes
steric interference between adjacent side
chains. The arrows indicate the direction
from the N- to the C-terminus.

Key: α-carbon, hydrogen, oxygen, carbonyl carbon, nitrogen, side chain.


Figure 4.8 ►

Rotation around the N—Cα and Cα—C bonds
that link peptide groups in a polypeptide chain.
(a) Peptide groups in an extended conformation.
(b) Peptide groups in an unstable conformation
caused by steric interference between
carbonyl oxygens of adjacent residues. The
van der Waals radii of the carbonyl oxygen
atoms are shown by the dashed lines. The
rotation angle around the N—Cα bond is
called φ (phi), and that around the Cα—C
bond is called ψ (psi). The substituents of
the outer α-carbons have been omitted for
clarity.

Key: α-carbon, hydrogen, carbonyl carbon, nitrogen, side chain.

chain (Figure 4.8). The presence of bulky side chains also restricts free rotation around
the N—Cα and Cα—C bonds. Proline is a special case — rotation around the N—Cα
bond is constrained because it is part of the pyrrolidine ring structure of proline.

The rotation angle around the N—Cα bond of a peptide group is designated φ (phi),
and that around the Cα—C bond is designated ψ (psi). The peptide bond angle is ω
(omega). Because rotation around peptide bonds is hindered by their double-bond
character, most of the conformation of the backbone of a polypeptide can be described by φ
and ψ. Each of these angles is defined by the relative positions of four atoms of the
backbone. Clockwise angles are positive, and counterclockwise angles are negative, with each
having a 180° sweep. Thus, each of the rotation angles can range from −180° to +180°.
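
As an illustration of how a rotation angle is obtained from four backbone atoms, the following Python sketch computes a signed dihedral angle from four sets of Cartesian coordinates. It is a minimal example, not the procedure used by any particular structure program: the function names are hypothetical, and the sign convention (cis = 0°, trans = 180°, clockwise positive when looking along the central bond) is assumed to follow the usual convention for torsion angles. Under the standard definitions, φ would be computed from the atoms C(i−1), N(i), Cα(i), C(i) and ψ from N(i), Cα(i), C(i), N(i+1).

```python
import math


def _sub(a, b):
    """Component-wise difference a - b of two 3D points."""
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    """Dot product of two 3D vectors."""
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    """Cross product of two 3D vectors."""
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def dihedral_degrees(p0, p1, p2, p3):
    """Signed dihedral angle (degrees) about the p1-p2 bond, in the range -180 to +180."""
    b1 = _sub(p1, p0)
    b2 = _sub(p2, p1)
    b3 = _sub(p3, p2)
    n1 = _cross(b1, b2)  # normal to the plane containing p0, p1, p2
    n2 = _cross(b2, b3)  # normal to the plane containing p1, p2, p3
    x = _dot(n1, n2)
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    return math.degrees(math.atan2(y, x))


if __name__ == "__main__":
    # Made-up coordinates (not from a real protein) chosen to give a quarter turn.
    print(dihedral_degrees((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1)))
```

Using `atan2` rather than `acos` keeps the sign of the angle, which is what allows the result to span the full −180° to +180° range described above.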

The biophysicist G. N. Ramachandran and his colleagues constructed space-filling
models of peptides and made calculations to determine which values of φ and ψ are
sterically permitted in a polypeptide chain. Permissible angles are shown as shaded
regions in Ramachandran plots of φ versus ψ. Figure 4.9a shows the results of theoretical
calculations — the dark, shaded regions represent permissible angles for most residues,
and the lighter areas cover the φ and ψ values for smaller amino acid residues where the


▲ Figure 4.9 

Ramachandran plot. (a) Solid lines indicate the range of permissible φ and ψ values based on molecular models. Dashed lines give the outer limits for
an alanine residue. Large blue dots correspond to values of φ and ψ that produce recognizable conformations such as the α helix and β sheets. The
positions shown for the type II turn are for the second and third residues. The white portions of the plot correspond to values of φ and ψ that were
predicted to occur rarely. (b) Observed φ and ψ values in known structures. Crosses indicate values for typical residues in a single protein. Residues in
an α helix are shown in red, β-strand residues are blue, and others are green.


R groups don’t restrict rotation. Blank areas on a Ramachandran plot are nonpermissible
areas, due largely to steric hindrance. The conformations of several types of ideal
secondary structure fall within the shaded areas, as expected.

Another version of a Ramachandran plot is shown in Figure 4.9b. This plot is based
on the observed φ and ψ angles of hundreds of proteins whose structures are known.
The enclosed inner regions represent angles that are found very frequently, and the
outer enclosed regions represent angles that are less frequent. Typical observed angles
for α helices, β sheets, and other structures in a protein are plotted. The most important
difference between the theoretical and observed Ramachandran plots is in the region
around 0° φ and −90° ψ. This region should not be permitted according to the modeling
studies but there are many examples of residues with these angles. It turns out that
steric clashes are prevented in these regions by allowing a small amount of rotation
around the peptide bond. The peptide group does not have to be exactly planar; a little
bit of wiggle is permitted!

Some bulky amino acid residues have smaller permitted areas. Proline is restricted
to a φ value of about −60° to −77° because its N-Cα bond is constrained by inclusion
in the pyrrolidine ring of the side chain. In contrast, glycine is exempt from many steric
restrictions because it lacks a β-carbon. Thus, glycine residues have greater conformational
freedom than other residues and have φ and ψ values that often fall outside the
shaded regions of the Ramachandran plot.


The three-dimensional conformation of a
polypeptide backbone is defined by the
φ (phi) and ψ (psi) angles of rotation
around each peptide group.


Almost all peptide groups adopt the trans conformation since 
that is the one favored during protein synthesis. It is much 
more stable than the cis conformation (with one exception). 
Spontaneous switching to the cis conformation is very rare 
and it is almost always accompanied by loss of function since 
the structure of the protein is severely affected. 

However, the activity of some proteins is actually 
regulated by conformation changes due to cis/trans isomer- 
ization. The change in peptide group conformation invari- 
ably takes place at proline residues because the cis conforma- 
tion is almost as stable as the trans conformation. This is the 
one exception to the rule. 

Specific enzymes, called peptidyl prolyl cis/trans isomerases,
catalyze the interconversion of cis and trans conformation
at proline residues by transiently destabilizing the
resonance hybrid structure of the peptide bond and allowing
rotation. One important class of these enzymes recognizes
Ser-Pro and Thr-Pro bonds whenever the serine and threonine
residues are phosphorylated. Phosphorylation of amino
acid residues is an important mechanism of regulation by covalent
modification (see Section 5.9D). The gene for this type
of peptidyl prolyl cis/trans isomerase is called Pin1 and it is
present in all eukaryotes.

In the small flowering plant, Arabidopsis thaliana, Pin1
protein acts on some transcription factors that control the timing
of flowering. When threonine residues are phosphorylated,
the transcription factors are recognized by Pin1 and the conformation
of the Thr-Pro bond is switched from trans to cis. The
resulting conformational change in the structure of the protein
leads to activation of the transcription factors and transcription
of the genes required for producing flowers. Flowering is considerably
delayed when the synthesis of peptidyl prolyl cis/trans
isomerase is inhibited by mutations in the Pin1 gene.

In humans the cis/trans isomerase encoded by Pin1 plays
a role in regulating gene expression by modifying RNA polymerase,
transcription factors, and other proteins. Mutations in
this gene have been implicated in several hereditary diseases.
The structure of human peptidyl prolyl cis/trans isomerase is
shown in Figure 4.23e.

a Arabidopsis thaliana, also known as thale cress or mouse-ear
cress, is a relative of mustard. It is a favorite model organism in plant
biology because it is easy to grow in the laboratory.


▲ Linus Pauling (1901-1994), winner of 
the Nobel Prize in Chemistry in 1954 and 
the Nobel Peace Prize in 1962. 

4.4 The α Helix

The α-helical conformation was proposed in 1950 by Linus Pauling and Robert Corey.
They considered the dimensions of peptide groups, possible steric constraints, and
opportunities for stabilization by formation of hydrogen bonds. Their model accounted
for the major repeat observed in the structure of the fibrous protein α-keratin. This repeat
of 0.50 to 0.55 nm turned out to be the pitch (the axial distance per turn) of the α helix.
Max Perutz added support for the structure when he observed a secondary
repeating unit of 0.15 nm in the X-ray diffraction pattern of α-keratin. The 0.15 nm
repeat corresponds to the rise of the α helix (the distance each residue advances the
helix along its axis). Perutz also showed that the α helix was present in hemoglobin,
confirming that this conformation was present in more complex globular proteins.

In theory, an α helix can be either a right- or a left-handed screw. The α helices
found in proteins are almost always right-handed, as shown in Figure 4.10. In an ideal α
helix, the pitch is 0.54 nm, the rise is 0.15 nm, and the number of amino acid residues
required for one complete turn is 3.6 (i.e., approximately 3 2/3 residues: one carbonyl
group, three N-Cα-C units, and one nitrogen). Most α helices are slightly distorted
in proteins but they generally have between 3.5 and 3.7 residues per turn.
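
The three numbers quoted for the ideal helix are linked: the pitch is the rise multiplied by the number of residues per turn. The short Python sketch below checks this relationship. The function name is an assumption made for illustration, and holding the rise fixed at 0.15 nm for distorted helices is a simplification rather than a claim from the text.

```python
RISE_NM = 0.15  # axial advance per residue in an ideal alpha helix (value from the text)

def helix_pitch_nm(residues_per_turn, rise_nm=RISE_NM):
    """Pitch (axial advance per turn) = residues per turn x rise per residue."""
    return residues_per_turn * rise_nm

ideal = helix_pitch_nm(3.6)
low, high = helix_pitch_nm(3.5), helix_pitch_nm(3.7)
print(f"ideal pitch: {ideal:.2f} nm")
print(f"pitch for 3.5 to 3.7 residues per turn: {low:.3f} to {high:.3f} nm")
```

With 3.6 residues per turn the product reproduces the 0.54 nm pitch, and under the fixed-rise assumption the 3.5 to 3.7 range gives pitches close to the upper end of the 0.50 to 0.55 nm repeat seen in α-keratin.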

[Figure 4.10 labels: right-handed α helix; pitch (advance per turn), 0.54 nm; rise (advance per amino acid residue)]

▲ Figure 4.10 

α Helix. A region of α-helical secondary structure is shown with the N-terminus at the bottom and the C-terminus at the top of the figure. Each
carbonyl oxygen forms a hydrogen bond with the amide hydrogen of the fourth residue further toward the C-terminus of the polypeptide chain.
The hydrogen bonds are approximately parallel to the long axis of the helix. Note that all the carbonyl groups point toward the C-terminus. In an ideal
α helix, equivalent positions recur every 0.54 nm (the pitch of the helix), each amino acid residue advances the helix by 0.15 nm along the long axis of
the helix (the rise), and there are 3.6 amino acid residues per turn. In a right-handed helix the backbone turns in a clockwise direction when viewed
along the axis from its N-terminus. If you imagine that the right-handed helix is a spiral staircase, you will be turning to the right as you walk down the
staircase.


Within an α helix, each carbonyl oxygen (residue n) of the polypeptide backbone is
hydrogen-bonded to the backbone amide hydrogen of the fourth residue further toward
the C-terminus (residue n + 4). (The three amino groups at one end of the helix
and the three carbonyl groups at the other end lack hydrogen-bonding partners within
the helix.) Each hydrogen bond closes a loop containing 13 atoms: the carbonyl oxygen,
11 backbone atoms, and the amide hydrogen. Thus, an α helix can also be called a
3.6₁₃ helix based on its number of residues per turn and hydrogen-bonded loop size. The
hydrogen bonds that stabilize the helix are nearly parallel to the long axis of the helix.

The φ and ψ angles of each residue in an α helix are similar. They cluster around
a stable region of the Ramachandran plot centered at a φ value of −57° and a ψ value of
−47° (Figure 4.9). The similarity of these values is what gives the α helix a regular,
repeating structure. The intramolecular hydrogen bonds between residues n and n + 4
tend to “lock in” rotation around the N-Cα and Cα-C bonds, restricting the φ and ψ
angles to a relatively narrow range.

A single intrahelical hydrogen bond would not provide appreciable structural stability
but the cumulative effect of many hydrogen bonds within an α helix stabilizes this
conformation. Hydrogen bonds between amino acid residues are especially stable in the
hydrophobic interior of a protein where water molecules do not enter and therefore
cannot compete for hydrogen bonding. In an α helix, all the carbonyl groups point toward
the C-terminus. The entire helix is a dipole with a positive N-terminus and a negative
C-terminus since each peptide group is polar and all the hydrogen bonds point in
the same direction.

The side chains of the amino acids in an α helix point outward from the cylinder
of the helix and they are not involved in the hydrogen bonds that stabilize the α helix
(Figure 4.11). However, the identity of the side chains affects the stability in other
ways. Because of this, some amino acid residues are found in α-helical conformations
more often than others. For example, alanine has a small, uncharged side chain and
fits well into the α-helical conformation. Alanine residues are prevalent in the α helices
of all classes of proteins. In contrast, tyrosine and asparagine with their bulky
side chains are less common in α helices. Glycine, whose side chain is a single hydrogen
atom, destabilizes α-helical structures since rotation around its α-carbon is so
unconstrained. For this reason, many α helices begin or end with glycine residues.
Proline is the least common residue in an α helix because its rigid cyclic side chain
disrupts the right-handed helical conformation by occupying space that a neighboring
residue of the helix would otherwise occupy. In addition, because it lacks a hydrogen
atom on its amide nitrogen, proline cannot fully participate in intrahelical hydrogen
bonding. For these reasons, proline residues are found more often at the ends of α helices
than in the interior.

Proteins vary in their α-helical content. In some proteins most of the residues are in
α helices, whereas other proteins contain very little α-helical structure. The average
content of α helix in the proteins that have been examined is 26%. The length of a
helix in a protein can range from about 4 or 5 residues to more than 40; the average is
about 12.

Many α helices have hydrophilic amino acids on one face of the helix cylinder and
hydrophobic amino acids on the opposite face. The amphipathic nature of the helix is
easy to see when the amino acid sequence is drawn as a spiral called a helical wheel. The
α helix shown in Figure 4.11 can be drawn as a helical wheel representing the helix
viewed along its axis. Because there are 3.6 residues per turn of the helix, the residues
are plotted every 100° along the spiral (Figure 4.12). Note that the helix is a right-handed
screw and it is terminated by a glycine residue at the C-terminal end. The hydrophilic
residues (asparagine, glutamate, aspartate, and arginine) tend to cluster on one side of
the helical wheel.
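
The 100° spacing follows directly from 360° divided by 3.6 residues per turn, and it explains why residues three or four positions apart end up on the same face of the helix. The Python sketch below places helix positions on a helical wheel. The function names are illustrative assumptions, and positions are simply numbered from 1 at the start of the helix rather than taken from any particular sequence.

```python
def wheel_angle(position, degrees_per_residue=100.0):
    """Angle (0 to 360 degrees) of a 1-based helix position on the helical wheel."""
    return ((position - 1) * degrees_per_residue) % 360.0

def angular_separation(a, b):
    """Smallest separation between two wheel angles, in degrees (0 to 180)."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)

# Positions 1 through 11, the length of the helix shown in Figure 4.11.
for position in range(1, 12):
    print(position, wheel_angle(position))

# Positions 1, 5, and 8 fall close together on the wheel.
print(angular_separation(wheel_angle(1), wheel_angle(5)))
print(angular_separation(wheel_angle(1), wheel_angle(8)))
```

In this numbering, positions 1, 5, and 8 all lie within 60° of one another, so side chains at those positions project from the same face of the helix.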

Amphipathic helices are often located on the surface of a protein with the
hydrophilic side chains facing outward (toward the aqueous solvent) and the hydrophobic
side chains facing inward (toward the hydrophobic interior). For example, the helix
shown in Figures 4.11 and 4.12 is on the surface of the water-soluble liver enzyme alcohol
dehydrogenase with the side chains of the first, fifth, and eighth residues

▲ Figure 4.11

View of a right-handed α helix. The blue ribbon
indicates the shape of the polypeptide
backbone. All the side chains, shown as
ball-and-stick models, project outward from
the helix axis. This example is from residues
Ile-355 (bottom) to Gly-365 (top) of horse
liver alcohol dehydrogenase. Some hydrogen
atoms are not shown. [PDB 1ADF].

▲ A right-handed α helix. This helix was
created by Julian Voss-Andreae. It stands
outside Linus Pauling’s childhood home in
Portland, Oregon, United States.


Figure 4.12 ► 

α helix in horse liver alcohol dehydrogenase.

Highly hydrophobic residues are blue, less
hydrophobic residues are green, and highly
hydrophilic residues are red. (a) Sequence of
amino acids. (b) Helical wheel diagram.

The known frequencies of various
amino acid residues in α helices are
used to predict the secondary structure
based on the primary sequence alone.

▲ Figure 4.14 

Leucine zipper region of yeast
(Saccharomyces cerevisiae) GCN4 protein
bound to DNA. GCN4 is a transcription regulatory
protein that binds to specific DNA
sequences. The DNA-binding region consists
of two amphipathic α helices, one from each
of the two subunits of the protein. The side
chains of leucine residues are shown in
a darker blue than the ribbon. Only the
leucine zipper region of the protein is shown
in the figure. [PDB 1YSA].

(isoleucine, phenylalanine, and leucine, respectively) buried in the protein interior 
(Figure 4.13). 

There are many examples of two amphipathic α helices that interact to produce an
extended coiled-coil structure where the two α helices wrap around each other with
their hydrophobic faces in contact and their hydrophilic faces exposed to solvent. A
common structure in DNA-binding proteins is called a leucine zipper (Figure 4.14). The
name refers to the fact that two α helices are “zippered” together by the hydrophobic
interactions of leucine residues (and other hydrophobic residues) on one side of an
amphipathic helix. The ends of the helices form the DNA-binding region of the protein.

Some proteins contain a few short regions of a 3₁₀ helix. Like the α helix, the 3₁₀
helix is right-handed. The carbonyl oxygen of residue n in a 3₁₀ helix forms a hydrogen bond with the
amide hydrogen of residue n + 3 (as opposed to residue n + 4 in an α helix) so the 3₁₀ helix
has a tighter hydrogen-bonded ring structure than the α helix (10 atoms rather than
13) and has fewer residues per turn (3.0) and a longer pitch (0.60 nm) (Figure 4.15).

▲ Figure 4.13 

Horse (Equus ferus) liver alcohol dehydrogenase. The amphipathic α helix is highlighted. The side
chains of highly hydrophobic residues are shown in blue, less hydrophobic residues are green, and
charged residues are shown in red. Note that the side chains of the hydrophobic residues are directed
toward the interior of the protein and that the side chains of charged residues are exposed to
the surface. [PDB 1ADF].


The 3₁₀ helix is slightly less stable than the α helix because of steric hindrances and the
awkward geometry of its hydrogen bonds. When a 3₁₀ helix occurs, it is usually only a
few residues in length and often is the last turn at the C-terminal end of an α helix.
Because of its different geometry, the φ and ψ angles of residues in a 3₁₀ helix occupy a
different region of the Ramachandran plot than the residues of an α helix (Figure 4.9).
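
The subscripts in the names 3.6₁₃ and 3₁₀ can be recovered by counting the atoms in the ring closed by each backbone hydrogen bond. The Python sketch below does the counting for a bond from the carbonyl oxygen of residue n to the amide hydrogen of residue n + offset. The function name and its `offset` parameter are assumptions introduced for this illustration.

```python
def hbond_loop_atoms(offset):
    """Atoms in the ring closed by a C=O(n) to H-N(n + offset) backbone hydrogen bond."""
    carbonyl = 2                    # O and C of residue n
    intervening = 3 * (offset - 1)  # N, C-alpha, and C of each residue in between
    amide = 2                       # N and H of residue n + offset
    return carbonyl + intervening + amide

print("alpha helix (n to n + 4):", hbond_loop_atoms(4))
print("3-10 helix (n to n + 3):", hbond_loop_atoms(3))
```

The count for an offset of 4 matches the 13-atom loop of the α helix (the carbonyl oxygen, 11 backbone atoms, and the amide hydrogen), and an offset of 3 gives the 10-atom ring of the 3₁₀ helix.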

4.5 β Strands and β Sheets

The other common secondary structure is called β structure, a class that includes
β strands and β sheets. β Strands are portions of the polypeptide chain that are almost
fully extended. Each residue in a β strand accounts for about 0.32 to 0.34 nm of the
overall length in contrast to the compact coil of an α helix where each residue corresponds
to 0.15 nm of the overall length. When multiple β strands are arranged side-by-side
they form β sheets, a structure originally proposed by Pauling and Corey at the
same time they developed a theoretical model of the α helix.
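
The per-residue lengths quoted above make it easy to compare how far a given number of residues extends in each conformation. The Python sketch below is a simple illustration; the function names and the choice of a 10-residue segment are assumptions, not values from the text.

```python
ALPHA_RISE_NM = 0.15               # length per residue in an alpha helix (from the text)
BETA_RISE_RANGE_NM = (0.32, 0.34)  # length per residue in a beta strand (from the text)

def axial_length_nm(n_residues, rise_nm):
    """Length along the helix or strand axis covered by n_residues."""
    return n_residues * rise_nm

def beta_length_range_nm(n_residues):
    """Shortest and longest expected length of a beta strand of n_residues."""
    low, high = BETA_RISE_RANGE_NM
    return axial_length_nm(n_residues, low), axial_length_nm(n_residues, high)

n = 10  # hypothetical segment length
helix = axial_length_nm(n, ALPHA_RISE_NM)
strand_low, strand_high = beta_length_range_nm(n)
print(f"{n} residues as alpha helix: {helix:.2f} nm")
print(f"{n} residues as beta strand: {strand_low:.2f} to {strand_high:.2f} nm")
```

For the same number of residues, the extended β strand is a little more than twice as long as the α helix.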

Proteins rarely contain isolated β strands because the structure by itself is not significantly
more stable than other conformations. However, β sheets are stabilized by hydrogen
bonds between carbonyl oxygens and amide hydrogens on adjacent β strands.
Thus, in proteins, the regions of β structure are almost always found in sheets.

The hydrogen-bonded β strands can be on separate polypeptide chains or on different
segments of the same chain. The β strands in a sheet can be either parallel (running
in the same N- to C-terminal direction) (Figure 4.16a) or antiparallel (running in
opposite N- to C-terminal directions) (Figure 4.16b). When the β strands are antiparallel,
the hydrogen bonds are nearly perpendicular to the extended polypeptide chains. Note
that in the antiparallel β sheet, the carbonyl oxygen and the amide hydrogen atoms of
one residue form hydrogen bonds with the amide hydrogen and carbonyl oxygen of a
single residue in the other strand. In the parallel arrangement, the hydrogen bonds are
not perpendicular to the extended chains and each residue forms hydrogen bonds with
the carbonyl and amide groups of two different residues on the adjacent strand.

Parallel sheets are less stable than antiparallel sheets, possibly because the hydrogen
bonds are distorted in the parallel arrangement. The β sheet is sometimes called a
β pleated sheet since the planar peptide groups meet each other at angles, like the folds
of an accordion. As a result of the bond angles between peptide groups, the amino acid

▲ Figure 4.15 

The 3₁₀ helix. In the 3₁₀ helix (left) hydrogen
bonds (pink) form between the amide group
of one residue and the carbonyl oxygen of a
residue three positions away. In an α helix
(right) the carbonyl group bonds to an amino
acid residue four positions away.

▼ Figure 4.16

β Sheets. Arrows indicate the N- to C-terminal
direction of the peptide chain. (a) Parallel β
sheet. The hydrogen bonds are evenly spaced
but slanted. (b) Antiparallel β sheet. The
hydrogen bonds are essentially perpendicular
to the β strands, and the space between
hydrogen-bonded pairs is alternately wide
and narrow.







▲ Figure 4.17 

View of two strands of an antiparallel β sheet
from influenza virus A neuraminidase. Only the
side chains of the front β strand are shown.
The side chains alternate from one side of
the β strand to the other side. Both strands
have a right-handed twist. [PDB 1BJI]


There are only three different kinds of
common secondary structure: α helix,
β strand, and turns.

▲ U-turns are allowed in proteins. 

side chains point alternately above and below the plane of the sheet. A typical β sheet
contains from two to as many as 15 individual β strands. Each strand has an average of
six amino acid residues.

The β strands that make up β sheets are often twisted and the sheet is usually distorted
and buckled. The three-dimensional view of the β sheet of ribonuclease A
(Figure 4.3) shows a more realistic view of β sheets than the idealized structures in
Figure 4.16.

A view of two strands of a small β sheet is shown in Figure 4.17. The side chains of
the amino acid residues in the front strand alternately project to the left and to the right of
(i.e., above and below) the β strand, as described above. Typically, β strands twist slightly
in a right-hand direction; that is, they twist clockwise as you look along one strand.

The φ and ψ angles of the bonds in a β strand are restricted to a broad range of values
occupying a large, stable region in the upper left-hand corner of the Ramachandran
plot. The typical angles for residues in parallel and antiparallel strands are not identical
(see Figure 4.9). Because most β strands are twisted, the φ and ψ angles exhibit a
broader range of values than those seen in the more regular α helix.

Although we usually think of β sheets as examples of secondary structure this is
not, strictly speaking, correct. In many cases, the individual β strands are located in different
regions of the protein and only come together to form the β sheet when the protein
adopts its final tertiary conformation. Sometimes the quaternary structure of a
protein gives rise to a large β sheet. Some proteins are almost entirely β sheets but most
proteins have a much lower β-strand content.

In the previous section we noted that amphipathic α helices have hydrophobic
side chains that project outward on one side of the helix. This is the side that interacts
with the rest of the protein creating a series of hydrophobic interactions that help stabilize
the tertiary structure. The side chains of β sheets project alternately above and
below the plane of the β strands. One surface may consist of hydrophobic side chains
that allow the β sheet to lie on top of other hydrophobic residues in the interior of the
protein.

An example of such hydrophobic interactions between two β sheets is seen in the
structure of the coat protein of grass pollen grains (Figure 4.18a). This protein is the
major allergen affecting people who are allergic to grass pollen. One surface of each
β sheet contains hydrophobic side chains and the opposite surface has hydrophilic
side chains. The two hydrophobic surfaces interact to form the hydrophobic core of
the protein and the hydrophilic surfaces are exposed to solvent as shown in Figure
4.18b. This is an example of a β sandwich, one of several arrangements of secondary
structural elements that are covered in more detail in the section on tertiary structure
(Section 4.7).

4.6 Loops and Turns 

In both an α helix and a β strand there are consecutive residues with a similar conformation
that is repeated throughout the structure. Proteins also contain stretches of nonrepeating
three-dimensional structure. Most of these nonrepeating regions of secondary
structure can be characterized as loops or turns since they cause directional changes in the
polypeptide backbone. The conformations of peptide groups in nonrepetitive regions
are constrained just as they are in repetitive regions. They have φ and ψ values that are
usually well within the permitted regions of the Ramachandran plot and often close
to the values of residues that form α helices or β strands.

Loops and turns connect α helices and β strands and allow the polypeptide chain
to fold back on itself producing the compact three-dimensional shape seen in the native
structure. As much as one-third of the amino acid residues in a typical protein are
found in such nonrepetitive structures. Loops often contain hydrophilic residues and are
usually found on the surfaces of proteins where they are exposed to solvent and form
hydrogen bonds with water. Some loops consist of many residues of extended nonrepetitive
structure. About 10% of the residues can be found in such regions.


Loops containing only a few (up to five) residues are referred to as turns if they
cause an abrupt change in the direction of a polypeptide chain. The most common
types of tight turns are called reverse turns. They are also called β turns because they
often connect different antiparallel β strands. (Recall that in order to create a β sheet
the polypeptide must fold so that two or more regions of β strand are adjacent to one
another as shown in Figure 4.17.) This terminology is misleading since β turns can also
connect α helices or an α helix and a β strand.

There are two common types of β turn, designated type I and type II. Both types
of turn contain four amino acid residues and are stabilized by hydrogen bonding between
the carbonyl oxygen of the first residue and the amide hydrogen of the fourth
residue (Figure 4.19). Both type I and type II turns produce an abrupt (usually about
180°) change in the direction of the polypeptide chain. In type II turns, the third
residue is glycine about 60% of the time. Proline is often the second residue in both
types of turns.

Proteins contain many turn structures. They all have internal hydrogen bonds that
stabilize the structure and that’s why they can be considered a form of secondary structure.
Turns make up a significant proportion of the structure in many proteins. Some of
the bonds in turn residues have φ and ψ angles that lie outside the “permitted” regions of
a typical Ramachandran plot (Figure 4.9). This is especially true of residues in the third
position of type II turns where there is an abrupt change in the direction of the backbone.
This residue is often glycine so the bond angles can adopt a wider range of values without
causing steric clashes between the side-chain atoms and the backbone atoms. Ramachandran
plots usually show only the permitted regions for all residues except glycine; this is why
the rotation angles of type II turns appear to lie in a restricted area.



4.7 Tertiary Structure of Proteins 




Tertiary structure results from the folding of a polypeptide (which may already possess
some regions of α helix and β structure) into a closely packed three-dimensional structure.
An important feature of tertiary structure is that amino acid residues that are far
apart in the primary structure are brought together permitting interactions among
their side chains. Whereas secondary structure is stabilized by hydrogen bonding
between amide hydrogens and carbonyl oxygens of the polypeptide backbone, tertiary

▲ Figure 4.18 

Structure of PHL P2 from Timothy grass
(Phleum pratense) pollen. (a) The two short,
two-stranded, antiparallel β sheets are highlighted
in blue and purple to show their orientation
within the protein. (b) View of the
β-sandwich structure in a different orientation
showing hydrophobic residues (blue)
and polar residues (red). A number of
hydrophobic interactions connect the two
β sheets. [PDB 1BMW].


▲ Figure 4.19 

Reverse turns. (a) Type I β turn. The structure is stabilized by a hydrogen bond between the carbonyl oxygen of the first N-terminal residue (Phe) and
the amide hydrogen of the fourth residue (Gly). Note the proline residue at position n + 1. (b) Type II β turn. This turn is also stabilized by a hydrogen
bond between the carbonyl oxygen of the first N-terminal residue (Val) and the amide hydrogen of the fourth residue (Asn). Note the glycine residue at
position n + 2. [PDB 1AHL (giant sea anemone neurotoxin)].



structure is stabilized primarily by noncovalent interactions (mostly the hydrophobic
effect) between the side chains of amino acid residues. Disulfide bridges, though covalent,
are also elements of tertiary structure; they are not part of the primary structure
since they form only after the protein folds.

A. Supersecondary Structures 

Supersecondary structures, or motifs, are recognizable combinations of α helices,
β strands, and loops that appear in a number of different proteins. Sometimes motifs
are associated with a particular function although structurally similar motifs may have
different functions in different proteins. Some common motifs are shown in Figure 4.20.

One of the simplest motifs is the helix-loop-helix (Figure 4.20a). This structure
occurs in a number of calcium-binding proteins. Glutamate and aspartate residues in
the loop of these proteins form part of the calcium-binding site. In certain DNA-binding
proteins a version of this supersecondary structure is called a helix-turn-helix motif
since the residues that connect the helices form a reverse turn. In these proteins, the
residues of the α helices bind DNA.

The coiled-coil motif consists of two amphipathic α helices that interact through their
hydrophobic edges (Figure 4.20b) as in the leucine zipper example (Figure 4.14). Several
α helices can associate to form a helix bundle (Figure 4.20c). In this case, the individual
α helices have opposite orientations, whereas they are parallel in the coiled-coil motif.

The βαβ unit consists of two parallel β strands linked to an intervening α helix by
two loops (Figure 4.20d). The helix connects the C-terminal end of one β strand to the
N-terminal end of the next and often runs parallel to the two strands. A hairpin consists
of two adjacent antiparallel β strands connected by a β turn (Figure 4.20e). (One example
of a hairpin motif is shown in Figure 4.16.)

Figure 4.20 ► 

Common motifs. In folded proteins α helices
and β strands are commonly connected by
loops and turns to form supersecondary
structures, shown here as two-dimensional
representations. Arrows indicate the N- to
C-terminal direction of the peptide chain.

(a) Helix-loop-helix (b) Coiled coil (c) Helix bundle

(g) Greek key

(h) β-sandwich



The β meander motif (Figure 4.20f) is an antiparallel β sheet composed of sequential
β strands connected by loops or turns. The order of strands in the β sheet is the
same as their order in the sequence of the polypeptide chain. The β meander sheet may
contain one or more hairpins but, more typically, the strands are joined by larger loops.
The Greek key motif takes its name from a design found on classical Greek pottery. This
is a β sheet motif linking four antiparallel β strands such that strands 3 and 4 form the
outer edges of the sheet and strands 1 and 2 are in the middle of the sheet. The β sandwich
motif is formed when β strands or sheets stack on top of one another (Figure 4.20h). The
figure shows an example of a β sandwich where the β strands are connected by short
loops and turns, but β sandwiches can also be formed by the interaction of two β sheets
in different regions of the polypeptide chain, as seen in Figure 4.18.

B. Domains 

Many proteins are composed of several discrete, independently folded, compact units
called domains. Domains may consist of combinations of motifs. The size of a domain
varies from as few as 25 to 30 amino acid residues to more than 300. An example of a protein
with multiple domains is shown in Figure 4.21. Note that each domain is a distinct
compact unit consisting of various elements of secondary structure. Domains are usually
connected by loops but they are also bound to each other through weak interactions
formed by the amino acid side chains on the surface of each domain. The top domain of
pyruvate kinase in Figure 4.21 contains residues 116 to 219, the central domain contains
residues 1 to 115 plus 220 to 388, and the bottom domain contains residues 389 to 530. In
general, domains consist of a contiguous stretch of amino acid residues as in the top and
bottom domains of pyruvate kinase but in some cases a single domain may contain two or
more different regions of the polypeptide chain as in the middle domain.

The evolutionary conservation of protein structure is one of the most important 
observations that has emerged from the study of proteins in the past few decades. This 
conservation is most easily seen in the case of single-domain homologous proteins from 
different species. For example, in Chapter 3 we examined the sequence similarity of cy- 
tochrome c and showed that the similarities in primary structure could be used to con- 
struct a phylogenetic tree that reveals the evolutionary relationships of the proteins 
from different species (Section 3.11). As you might expect, the tertiary structures of cy- 
tochrome c proteins are also highly conserved (Figure 4.22). Cytochrome c is an exam- 
ple of a protein that contains a heme prosthetic group. The conservation of protein 
structure is a reflection of its interaction with heme and its conserved function as an 
electron transport protein in diverse species. 

Some domain structures occur in many different proteins whereas others are unique. In 
general, proteins can be grouped into families according to similarities in domain structures 
and amino acid sequence. All of the members of a family have descended from a common 
ancestral protein. Some biochemists believe that there may be only a few thousand families 

▲ Figure 4.21 

Pyruvate kinase from cat ( Felis domesticus). 

The main polypeptide chain of this common 
enzyme folds into three distinct domains as 
indicated by brackets. [PDB 1PKM]. 

◄ Figure 4.22 

Conservation of cytochrome c structure. 

(a) Tuna ( Thunnus alalunga ) cytochrome 
c bound to heme [PDB 5CYT]. (b) Tuna 
cytochrome c polypeptide chain, (c) Rice 
( Oryza sativa ) cytochrome c [PDB 1CCR]. 
(d) Yeast ( Saccharomyces cerevisiae ) 
cytochrome c [PDB 1YCC]. (e) Bacterial 
(Rhodopila globiformis) cytochrome c 
[PDB 1HR0]. 

102 CHAPTER 4 Proteins: Three-Dimensional Structure and Function 


▲ Figure 4.23 

Structural similarity of lactate and malate dehydrogenase. 
(a) Bacillus stearothermophilus 
lactate dehydrogenase [PDB 1LDN]. 

(b) Escherichia coli malate dehydrogenase 
[PDB 1EMD]. 

suggesting that all modern proteins are descended from only a few thousand proteins that 
were present in the most primitive organisms living 3 billion years ago. 

Lactate dehydrogenase and malate dehydrogenase are different enzymes that belong 
to the same family of proteins. Their structures are very similar, as shown in Figure 4.23. 
In spite of this obvious similarity in structure, the sequences of the two proteins are 
only 23% identical. Nevertheless, this level of sequence similarity is significant enough 
to conclude that the two proteins are homologous. They descend from a common ancestral 
gene that duplicated billions of years ago, before the last common ancestor of all extant 
species of bacteria. Both lactate dehydrogenase and malate dehydrogenase are present in 
the same species, which is why they are members of a family of related proteins. Protein 
families contain related proteins that are present in the same species. The cytochrome c 
proteins shown in Figure 4.22 are evolutionarily related but, strictly speaking, they are 
not members of a protein family because there is only one of them in each species. 
Protein families arise from gene duplication events. 

Protein domains can be classified by their structures. One commonly used classification 
scheme groups these domains into four categories. The “all-α” category contains 
domains that consist almost entirely of α helices and loops. “All-β” domains contain only 
β sheets and nonrepetitive structures that link β strands. The other two categories contain 
domains that have a mixture of α helices and β strands. Domains in the “α/β” class 
have supersecondary structures such as the βαβ motif and others in which regions of 
α helix and β strand alternate in the polypeptide chain. In the “α + β” category, the 
domains consist of local clusters of α helices and β sheet where each type of secondary 
structure arises from separate contiguous regions of the polypeptide chain. 

Protein domains can be further classified by the presence of characteristic folds 
within each of the four main structural categories. A fold is a combination of secondary 
structures that form the core of a domain. Figure 4.24 on pages 103-104 shows selected 
examples of proteins from each of the main categories and illustrates a number of common 
domain folds. Some domains have easily recognizable folds, such as the β meander 
that contains antiparallel β strands connected by hairpin loops (Figure 4.20f), or helix 
bundles (Figure 4.19c). Other folds are more complex (Figure 4.25). 

The important point about Figure 4.24 is not to memorize the structures of com- 
mon proteins and folds. The key concept is that proteins can adopt an amazing variety 
of different sizes and shapes (tertiary structure) even though they contain only three 
basic forms of secondary structure. 

The enzymatic activities of lactate 
dehydrogenase and malate dehydroge- 
nase are compared in Box 7.1. 

C. Domain Structure, Function, and Evolution 

The relationship between domain structure and function is complex. Often a single do- 
main has a particular function such as binding small molecules or catalyzing a single re- 
action. In multifunctional enzymes, each catalytic activity can be associated with one of 
several domains found in a single polypeptide chain (Figure 4.24j). However, in many 
cases the binding of small molecules and the formation of the active site of an enzyme 
take place at the interface between two separate domains. These interfaces often form 
crevices, grooves, and pockets that are accessible on the surface of the protein. The ex- 
tent of contact between domains varies from protein to protein. 

The unique shapes of proteins, with their indentations, interdomain interfaces, and 
other crevices, allow them to fulfill dynamic functions by selectively and transiently 
binding other molecules. This property is best illustrated by the highly specific binding 
of reactants (substrates) to substrate-binding sites, or active sites, of enzymes. Because 
many binding sites are positioned toward the interior of a protein, they are relatively 
free of water. When substrates bind, they fit so well that some of the few remaining 
water molecules in the binding site are displaced. 

D. Intrinsically Disordered Proteins 

This section on tertiary structure wouldn’t be complete without mentioning those pro- 
teins and domains that have no stable three-dimensional structure. These intrinsically 
disordered proteins (and domains) are quite common and the lack of secondary and 
tertiary structure is encoded in the amino acid sequences. There has been selection for 


clusters of charged residues (positive or negative) and proline residues that maintain 
the polypeptide chain in a disordered state. 

Many of these proteins interact with other proteins. They contain short amino acid 
sequences that serve as binding sites and these binding sites are within the intrinsically 
disordered regions. This allows easy access to the binding site. If a protein contains two 
different binding sites for other proteins then the disordered polypeptide chain acts as a 
tether to bring the two binding proteins closer together. Several transcription factors 
also contain disordered regions when they are not bound to DNA. These regions be- 
come ordered when the proteins interact with DNA. 

4.8 Quaternary Structure 

Many proteins exhibit an additional level of organization called quaternary structure. 
Quaternary structure refers to the organization and arrangement of subunits in a pro- 
tein with multiple subunits. Each subunit is a separate polypeptide chain. A multisub- 
unit protein is referred to as an oligomer (proteins with only one polypeptide chain are 
monomers). The subunits of a multisubunit protein may be identical or different. 
When the subunits are identical, dimers and tetramers predominate. When the subunits 
differ, each type often has a different function. A common shorthand method for describing 
oligomeric proteins uses Greek letters to identify types of subunits and subscript 
numerals to indicate numbers of subunits. For example, an α₂βγ protein contains 
two subunits designated α and one each of subunits designated β and γ. 

The subunits within an oligomeric protein always have a defined stoichiometry and 
the arrangement of the subunits gives rise to a stable structure where subunits are usu- 
ally held together by weak noncovalent interactions. Hydrophobic interactions are the 
principal forces involved although electrostatic forces may contribute to the proper 
alignment of the subunits. Because intersubunit forces are usually rather weak, the subunits 
of an oligomeric protein can often be separated in the laboratory. In vivo, however, 
the subunits usually remain tightly associated. 

Examples of several multisubunit proteins are shown in Figure 4.26. In the case of 
triose phosphate isomerase (Figure 4.26a) and HIV protease (Figure 4.26b), the identical 
subunits associate through weak interactions between the side chains found mainly in 
loop regions. Similar interactions are responsible for the formation of the MS2 capsid 
protein that consists of a trimer of identical subunits (Figure 4.26d). In this case, the 
trimer units assemble into a more complex structure — the bacteriophage particle. The 
enzyme HGPRT (Figure 4.26e) is a tetramer formed from the association of two pairs of 
nonidentical subunits. Each of the subunits is a recognizable domain. 

The potassium channel protein (Figure 4.26c) is an example of a tetramer of iden- 
tical subunits where the subunits interact to form a membrane-spanning region con- 
sisting of an eight-helix bundle. The subunits do not form separate domains within the 
protein but instead come together to form a single channel. The bacterial photosystem 
shown in Figure 4.26f is a complex example of quaternary structure. Three of the sub- 
units contribute to a large membrane-bound helix bundle while a fourth subunit (a cy- 
tochrome) sits on the exterior surface of the membrane. 

Determination of the subunit composition of an oligomeric protein is an essential 
step in the physical description of a protein. Typically, the molecular weight of the native 
oligomer is estimated by gel-filtration chromatography and then the molecular weight 
of each chain is determined by SDS-polyacrylamide gel electrophoresis (Section 3.6). 
For a protein having only one type of chain, the ratio of the two values provides the 
number of chains per oligomer. 
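
The ratio described above can be sketched in a few lines of Python. This is only an illustration: the function name and the example molecular weights are assumptions chosen for the sketch, not values from the text, and real gel-filtration and SDS-PAGE estimates carry enough error that the ratio is rounded to the nearest whole number.

```python
def chains_per_oligomer(native_mw: float, chain_mw: float) -> int:
    """Estimate the number of identical chains in an oligomer.

    native_mw: molecular weight of the native oligomer (e.g., from gel filtration)
    chain_mw:  molecular weight of a single chain (e.g., from SDS-PAGE)
    """
    if native_mw <= 0 or chain_mw <= 0:
        raise ValueError("molecular weights must be positive")
    # Both measurements are approximate, so round the ratio to an integer.
    return round(native_mw / chain_mw)


# Hypothetical values, chosen only for illustration.
print(chains_per_oligomer(200_000, 50_000))  # a tetramer of 50,000 chains
print(chains_per_oligomer(98_000, 51_000))   # a dimer, despite measurement error
```

The approach applies only to proteins with a single type of chain; an oligomer with different subunits needs the stoichiometry of each chain determined separately.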

The fact that a large proportion of proteins consist of multiple subunits is probably 
related to several factors: 

1. Oligomers are usually more stable than their dissociated subunits suggesting that 
quaternary structure prolongs the life of a protein in vivo. 

2. The active sites of some oligomeric enzymes are formed by residues from adjacent 
polypeptide chains. 


There are only three basic types of 
secondary structure but thousands of 
tertiary folds and domains. 

Speculations on the possible relation- 
ship between protein domains and 
gene organization will be presented 
in Chapter 21. 

The structures and functions of bacteri- 
al and plant photosystems are 
described in Chapter 15. 




▲ Figure 4.24 

Examples of tertiary structure in selected proteins. (a) Human (Homo sapiens) serum albumin [PDB 1BJ5] (class: all-α). This protein has several 
domains consisting of layered α helices and helix bundles. (b) Escherichia coli cytochrome b562 [PDB 1QPU] (class: all-α). This is a heme-binding 
protein consisting of a single four-helix bundle domain. (c) Escherichia coli UDP N-acetylglucosamine acyl transferase [PDB 1LXA] (class: all-β). The 
structure of this enzyme shows a classic example of a β helix domain. (d) Jack bean (Canavalia ensiformis) concanavalin A [PDB 1CON] (class: all-β). 
This carbohydrate-binding protein (lectin) is a single-domain protein made up of a large β sandwich fold. (e) Human (Homo sapiens) peptidylprolyl 
cis/trans isomerase [PDB 1VBS] (class: all-β). The dominant feature of the structure is a β sandwich fold. (f) Cow (Bos taurus) γ-crystallin 
[PDB 1A45] (class: all-β). This protein contains two β barrel domains. (g) Jellyfish (Aequorea victoria) green fluorescent protein [PDB 1GFL] (class: 
all-β). This is a β barrel structure with a central α helix. The strands of the sheet are antiparallel. (h) Pig (Sus scrofa) retinol-binding protein [PDB 
1AQB] (class: all-β). Retinol binds in the interior of a β barrel fold. (i) Brewer’s yeast (Saccharomyces carlsbergensis) old yellow enzyme (FMN 
oxidoreductase) [PDB 1OYA] (class: α/β). The central fold is an α/β barrel with parallel β strands connected by α helices. Two of the connecting α 
helical regions are highlighted in yellow. (j) Escherichia coli enzyme required for tryptophan biosynthesis [PDB 1PII] (class: α/β). This is a bifunctional 
enzyme containing two distinct domains. Each domain is an example of an α/β barrel. The left-hand domain contains the indoleglycerol phosphate 



▲ Figure 4.24 ( continued ) 

synthetase activity, and the right-hand domain contains the phosphoribosylanthranilate isomerase activity. (k) Pig (Sus scrofa) adenylyl kinase 
[PDB 3ADK] (class: α/β). This single-domain protein consists of a five-stranded parallel β sheet with layers of α helices above and below the sheet. 
The substrate binds in the prominent groove between α helices. (l) Escherichia coli flavodoxin [PDB 1AHN] (class: α/β). The fold is a five-stranded 
parallel twisted sheet surrounded by α helices. (m) Human (Homo sapiens) thioredoxin [PDB 1ERU] (class: α/β). The structure of this protein is 
very similar to that of E. coli flavodoxin except that the five-stranded twisted sheet in the thioredoxin fold contains a single antiparallel strand. 
(n) Escherichia coli L-arabinose-binding protein [PDB 1ABE] (class: α/β). This is a two-domain protein where each domain is similar to that in E. coli 
flavodoxin. The sugar L-arabinose binds in the cavity between the two domains. (o) Escherichia coli DsbA (thiol-disulfide oxidoreductase/disulfide 
isomerase) [PDB 1A23] (class: α/β). The predominant feature of this structure is a (mostly) antiparallel β sheet sandwiched between α helices. Cysteine 
side chains at the end of one of the α helices are shown (sulfur atoms are yellow). (p) Neisseria gonorrhoeae pilin [PDB 2PIL] (class: α + β). This 
polypeptide is one of the subunits of the pili on the surface of the bacteria responsible for gonorrhea. There are two distinct regions of the structure: a 
β sheet and a long α helix. 


Figure 4.25 ► 

Common domain folds. 

(a) Parallel twisted sheet 

(b) β barrel 

(c) α/β barrel (d) β helix 

3. The three-dimensional structures of many oligomeric proteins change when the pro- 
teins bind ligands. Both the tertiary structures of the subunits and the quaternary 
structures (i.e., the contacts between subunits) may be altered. Such changes are key 
elements in the regulation of the biological activity of certain oligomeric proteins. 

4. Different proteins can share the same subunits. Since many subunits have a defined 
function (e.g., ligand binding), evolution has favored selection for different combi- 
nations of subunits to carry out related functions. This is more efficient than selec- 
tion for an entirely new monomeric protein that duplicates part of the function. 

5. A multisubunit protein may bring together two sequential enzymatic steps where 
the product of the first reaction becomes the substrate of the second reaction. This 
gives rise to an effect known as channeling (Section 5.11). 

As shown in Figure 4.26, the variety of multisubunit proteins ranges from simple 
homodimers such as triose phosphate isomerase to large complexes such as the photo- 
systems in bacteria and plants. We would like to know how many proteins are 
monomers and how many are oligomers but studies of cell proteomes — the complete 
complement of proteins — have only begun. 

Table 4.1 on page 108 shows the results of a survey of E. coli proteins in the SWISS- 
PROT database. Of those polypeptides that have been analyzed, only about 19% are in 
monomers. Dimers are the largest class among the oligomers, and homodimers — where 
the two subunits are identical — represent 31% of all proteins. The next largest class is 
tetramers of identical subunits. Note that trimers are relatively rare. Most proteins exhibit 
dyad symmetry meaning that you can usually draw a line through a protein dividing it 
into two halves that are symmetrical about this axis. This dyad symmetry is seen even in 




▲ Figure 4.26 

Quaternary structure. (a) Chicken (Gallus gallus) triose phosphate isomerase [PDB 1TIM]. This protein has two identical subunits with α/β barrel folds. 
(b) HIV-1 aspartic protease [PDB 1DIF]. This protein has two identical all-β subunits that bind symmetrically. HIV protease is the target of many new 
drugs designed to treat AIDS patients. (c) Streptomyces lividans potassium channel protein [PDB 1BL8]. This membrane-bound protein has four 
identical subunits, each of which contributes to a membrane-spanning eight-helix bundle. (d) Bacteriophage MS2 capsid protein [PDB 2MS2]. The 
basic unit of the MS2 capsid is a trimer of identical subunits with a large β sheet. (e) Human (Homo sapiens) hypoxanthine-guanine phosphoribosyl 
transferase (HGPRT) [PDB 1BZY]. HGPRT is a tetrameric protein containing two different types of subunit. (f) Rhodopseudomonas viridis photosystem 
[PDB 1PRC]. This complex, membrane-bound protein has two identical subunits (orange, blue) and two other subunits (purple, green) bound to 
several molecules of photosynthetic pigments. 


Table 4.1 Natural occurrence of oligomeric proteins in Escherichia coli 



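[Table body not recoverable. Surviving column headings: "Number of" (truncated), "Number of heterooligomers", "Percent". Surviving row label: "Higher oligomers".] 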






Figure 4.27 ► 

Large protein complexes in the bacterium 
Mycoplasma pneumoniae. M. pneumoniae 
causes some forms of pneumonia in hu- 
mans. This species has one of the smallest 
genomes known (689 protein-encoding 
genes). Most of those genes are likely to 
represent the minimum proteome of a living 
cell. The cell contains several large complexes 
found in all cells: pyruvate dehydrogenase 
(purple), ribosome (yellow), GroEL 
(red), and RNA polymerase (orange). It also 
contains a rod (green) found only in some 
bacteria. [Adapted from Kuhner et al. (2009). 
Proteome organization in a genome-reduced 
bacterium. Science 326:1235-1240] 

heterooligomers such as hypoxanthine-guanine phosphoribosyl transferase (HGPRT, 
Figure 4.26e) and hemoglobin (Section 4.14). Of course, there are many exceptions, es- 
pecially when the oligomers are large complexes. 

We will encounter many other examples of multisubunit proteins throughout this 
textbook, especially in the chapters on information flow (Chapters 20-22). DNA polymerase, 
RNA polymerase, and the ribosome are excellent examples. Other examples include 
GroEL (Section 4.11D) and pyruvate dehydrogenase (Section 13.1). Many of 
these large proteins are easily seen in electron micrographs, as illustrated in Figure 4.27. 

Large complexes are referred to, metaphorically, as protein machines since the vari- 
ous polypeptide components work together to carry out a complex reaction. The term 


was originally coined to describe complexes such as the replisome (Figure 20.15) 
but there are many other examples, including those shown in Figure 4.27. 

The bacterial flagellum (Figure 4.28) is a spectacular example of a protein 
machine. The complex drives the rotation of a long flagellum using protonmotive 
force as an energy source (Section 14.3). More than 50 genes are required to 
build the flagellum in E. coli but surveys of other bacteria reveal that there are 
only about 21 core proteins required to build a functional flagellum. The evolutionary 
history of this protein machine is being actively investigated and it appears 
that it was built up by combining simpler components involved in ATP synthesis 
and membrane secretion. 

4.9 Protein-Protein Interactions 

The various subunits in multisubunit proteins bind to each other so strongly that 
they rarely dissociate inside the cell. These protein-protein contacts are characterized 
by a number of weak interactions. We have already become familiar with the 
types of interactions involved: hydrogen bonds, charge-charge interactions, van der 
Waals forces, and hydrophobic interactions (Section 2.5). In some cases the contact 
areas between two subunits are localized to small patches on the surface of the 
polypeptides, while in other cases there can be extensive contact spread over 
large portions of the polypeptides. The distinguishing feature of subunit contacts 
is the cumulative effect of a large number of individual weak interactions giving a 
binding strength that is sufficient to keep the subunits together. 

In addition to subunit-subunit contacts, there are many other types of protein- 
protein interactions that are less stable. These range from transient contacts between 
external proteins and receptors on the cell surface to weak interactions between various 
enzymes in metabolic pathways. These weak interactions are much more difficult to detect 
but they are essential components of many biochemical reactions. 

Consider a simple interaction between two proteins, P1 and P2, to give a complex 
P1:P2. The equilibrium between the free and bound molecules can be described by either 
an association constant ($K_a$) or a dissociation constant ($K_d$) ($K_a = 1/K_d$). 


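$$\mathrm{P1} + \mathrm{P2} \rightleftharpoons \mathrm{P1{:}P2} \qquad K_a = \frac{[\mathrm{P1{:}P2}]}{[\mathrm{P1}][\mathrm{P2}]}$$

$$\mathrm{P1{:}P2} \rightleftharpoons \mathrm{P1} + \mathrm{P2} \qquad K_d = \frac{[\mathrm{P1}][\mathrm{P2}]}{[\mathrm{P1{:}P2}]}$$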






▲ Figure 4.28 

Bacterial flagellum. The bacterial flagellum is 
a protein machine composed of 21 core 
subunits found in all species (blue boxes). 
Two additional subunits are missing in 
Firmicutes (white boxes) and five others are 
sporadically distributed. The flagellum (hook 
+ filament + cap) spins as the motor complex 
rotates. The three layers represent the outer 
membrane (top), the peptidoglycan layer 
(middle), and the cytoplasmic membrane 
(bottom). (Courtesy of Howard Ochman.) 

Typical association constants for the binding of subunits in a multimeric protein are 
greater than $10^{8}\,\mathrm{M}^{-1}$ ($K_a > 10^{8}\,\mathrm{M}^{-1}$) and can range as high as $10^{14}\,\mathrm{M}^{-1}$ for very tight 
interactions. At the other extreme are protein-protein interactions that are so weak they 
have no biological significance. These can be fortuitous interactions that arise from time 
to time because any two polypeptides will almost always form some kind of weak contact. 
The lower limit of relevant association constants is about $10^{4}\,\mathrm{M}^{-1}$ ($K_a < 10^{4}\,\mathrm{M}^{-1}$). The 
really interesting cases are those with association constants between these two values. 

The binding of transcription factors to RNA polymerase is one example of weak 
protein-protein interactions that are very important. The association constants range 
from about $10^{5}\,\mathrm{M}^{-1}$ to $10^{7}\,\mathrm{M}^{-1}$. The interactions between proteins in signaling pathways 
also fall into this range as do the interactions between enzymes in metabolic pathways. 

Let’s look at what these association constants mean in terms of protein concentrations. 
As the concentrations of P1 and P2 increase it becomes more and more 
likely that they will interact and bind to each other. At some concentration, the rate of 
binding (a second-order reaction) becomes comparable to the rate of dissociation (a 
first-order reaction) and complexes will be present in appreciable amounts. Using the 
association constant, we can calculate the ratio of free polypeptide (P1 or P2) as a 
fraction of the total concentration of either one ($\mathrm{P1_T}$ or $\mathrm{P2_T}$). This ratio, [free]/[total], 
tells us how much of the complex will be present at a given protein concentration. 


Figure 4.29 ► 

Association constants and protein concentration. 

The ratio of free unbound protein to total 
protein is shown for a protein-protein inter- 
action at three different association constants. 
Assuming that the other component is 
present in excess, the concentration 
at which half the molecules are in complex 
and half are free corresponds to the 
reciprocal of the association constant. [Adapted 
from van Holde, Johnson, and Ho, Principles 
of Physical Biochemistry, Prentice Hall.] 



The curves in Figure 4.29 show these ratios for three different association constants 
corresponding to very weak ($K_a = 10^{4}\,\mathrm{M}^{-1}$), moderate ($K_a = 10^{6}\,\mathrm{M}^{-1}$), and very strong 
($K_a = 10^{8}\,\mathrm{M}^{-1}$) protein-protein interactions. If we assume that one of the components 
is present in excess, then the curves represent the concentrations of only the rate-limiting 
polypeptide. One can demonstrate mathematically that for simple systems the point 
at which half of the polypeptide is free and half is in a complex corresponds to the 
reciprocal of the association constant. For example, if $K_a = 10^{8}\,\mathrm{M}^{-1}$ then most of the 
polypeptide will be bound at any concentration over $10^{-8}$ M. 
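
The half-point relationship can be checked numerically. The sketch below assumes the simplest case: P2 is in such excess that its free concentration is effectively constant, so that [P1:P2] = Ka[P1][P2] and the free fraction of P1 is 1/(1 + Ka[P2]). The function name is a hypothetical one chosen for this illustration.

```python
def fraction_free(k_a: float, excess_conc: float) -> float:
    """Fraction of the limiting polypeptide that is not in a complex.

    k_a:         association constant in M^-1
    excess_conc: molar concentration of the partner present in excess

    From Ka = [P1:P2] / ([P1][P2]) and P1_total = [P1] + [P1:P2]:
        [P1] / P1_total = 1 / (1 + Ka * [P2])
    """
    if k_a < 0 or excess_conc < 0:
        raise ValueError("inputs must be non-negative")
    return 1.0 / (1.0 + k_a * excess_conc)


# The three association constants plotted in Figure 4.29.
for k_a in (1e4, 1e6, 1e8):
    half_point = 1.0 / k_a  # concentration equal to the reciprocal of Ka
    at_half = fraction_free(k_a, half_point)
    at_100x = fraction_free(k_a, 100 * half_point)
    print(f"Ka = {k_a:.0e} per M: free fraction {at_half:.2f} at 1/Ka, "
          f"{at_100x:.3f} at 100 times that concentration")
```

Setting the partner concentration to 1/Ka makes the product Ka[P2] equal to 1, which is why half of the polypeptide is free at that point for any value of Ka.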

What does this mean in terms of molecules per cell? For an E. coli cell whose 
volume is about $2 \times 10^{-15}$ L it means that as long as there are more than a dozen molecules 
per cell the complex will be stable if $K_a > 10^{8}\,\mathrm{M}^{-1}$. This is why large oligomeric 
complexes can exist in E. coli even if there are only a few dozen per cell. Most eukaryotic 
cells are 1000 times larger and there must be 12,000 molecules in order to achieve a 
concentration of $10^{-8}$ M. Figure 4.29 also shows why it is impossible for weak interactions 
to produce significant numbers of P1:P2 complexes. The protein concentration has to 
be greater than $10^{-4}$ M in order for the complex to be present in significant quantity 
and this concentration corresponds to 120,000 molecules in an E. coli cell or 120 million 
molecules in a eukaryotic cell. There are no free polypeptides present at such concentrations 
so weak interactions of this magnitude are biologically meaningless. 
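
The molecule counts quoted above follow from multiplying concentration, cell volume, and Avogadro's number. The short sketch below reproduces that arithmetic; the cell volumes are the approximate figures given in the text, and the function and constant names are assumptions made for this illustration.

```python
AVOGADRO = 6.022e23  # molecules per mole


def molecules_per_cell(molar_conc: float, cell_volume_l: float) -> float:
    """Number of molecules needed to reach a molar concentration in one cell."""
    return molar_conc * cell_volume_l * AVOGADRO


E_COLI_VOLUME_L = 2e-15                       # approximate volume from the text
EUKARYOTE_VOLUME_L = 1000 * E_COLI_VOLUME_L   # "1000 times larger"

for conc in (1e-8, 1e-4):
    bacterial = molecules_per_cell(conc, E_COLI_VOLUME_L)
    eukaryotic = molecules_per_cell(conc, EUKARYOTE_VOLUME_L)
    print(f"{conc:.0e} M: about {bacterial:,.0f} molecules per E. coli cell, "
          f"about {eukaryotic:,.0f} per eukaryotic cell")
```

Because the count scales linearly with both concentration and volume, a 10,000-fold weaker interaction requires 10,000 times as many molecules to populate the complex to the same extent.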

There are many techniques for detecting moderate binding. These include direct 
techniques such as affinity chromatography, immunoprecipitation, and chemical cross-linking. 
Newer techniques rely on more sophisticated manipulations such as phage display, 
two-hybrid analysis, and genetic methods. Many workers are attempting to map 
the interactions of every protein in the cell using these techniques. An example of such 
an “interactome” for many E. coli proteins is shown in Figure 4.30. Note that strong in- 
teractions between the subunits of oligomers are easily detected as shown by lines con- 
necting the subunits of RNA polymerase, the ribosome, and DNA polymerase. Other 
lines connect RNA polymerase to various transcription factors — these represent mod- 
erate interactions. Further studies of the “interactome” in various species should give us 
a much better picture of the complex protein-protein interactions in living cells. 

4.10 Protein Denaturation and Renaturation 

Environmental changes or chemical treatments may disrupt the native conformation of 
a protein causing loss of biological activity. Such a disruption is called denaturation. The 
amount of energy needed to cause denaturation is often small, perhaps equivalent to 
that needed for the disruption of three or four hydrogen bonds. Some proteins may unfold 
completely when denatured to form a random coil (a fluctuating chain considered to be 
totally disordered) but most denatured proteins retain considerable internal structure. 
It is sometimes possible to find conditions under which small denatured proteins can 
spontaneously renature, or refold, following denaturation. 


◄ Figure 4.30 

E. coli interactome. Each point on the dia- 
gram represents a single E. coli protein. Red 
dots are essential proteins and blue dots are 
nonessential proteins. Lines joining the 
points indicate experimentally determined 
protein-protein interactions. Five large com- 
plexes are shown: RNA polymerase, DNA 
polymerase, ribosome and associated pro- 
teins, proteins interacting with cysteine 
desulfurase (IscS), and proteins associated 
with acyl carrier protein (ACP). (The role of 
ACP is described in Section 16.1.) 

[Adapted from Butland et al. (2005)] 

Proteins are commonly denatured by heating. Under the appropriate conditions, 
a modest increase in temperature will result in unfolding and loss of secondary and 
tertiary structure. An example of thermal denaturation is shown in Figure 4.31. In this 
experiment, a solution containing bovine ribonuclease A is heated slowly and the struc- 
ture of the protein is monitored by various techniques that measure changes in confor- 
mation. All of these techniques detect a change when denaturation occurs. In the case of 
bovine ribonuclease A, thermal denaturation also requires a reducing agent that dis- 
rupts internal disulfide bridges allowing the protein to unfold. 

Denaturation takes place over a relatively small range of temperature. This indicates 
that unfolding is a cooperative process where the destabilization of just a few weak 
interactions leads to almost complete loss of native conformation. Most proteins have a 
characteristic “melting” temperature ($T_m$) that corresponds to the temperature at the 
midpoint of the transition between the native and denatured forms. The $T_m$ depends on 
pH and the ionic strength of the solution. 
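
One simple way to read a melting temperature from a denaturation curve is to find where the fraction unfolded crosses 0.5. The sketch below does this by linear interpolation between neighboring measurements. The data points are invented for illustration only and are not taken from any experiment in the text, and the function name is likewise hypothetical.

```python
def melting_temperature(temps, fraction_unfolded):
    """Estimate Tm as the temperature where the fraction unfolded reaches 0.5.

    temps:             temperatures in ascending order
    fraction_unfolded: fraction of molecules unfolded at each temperature
    """
    points = list(zip(temps, fraction_unfolded))
    for (t0, f0), (t1, f1) in zip(points, points[1:]):
        # Find the pair of measurements that brackets the midpoint.
        if f0 <= 0.5 <= f1 and f1 > f0:
            # Interpolate linearly between the two bracketing points.
            return t0 + (0.5 - f0) * (t1 - t0) / (f1 - f0)
    raise ValueError("fraction unfolded never crosses 0.5")


# Made-up values that mimic a steep, cooperative transition.
temps_c = [10, 20, 25, 30, 35, 40, 50]
unfolded = [0.00, 0.02, 0.10, 0.40, 0.80, 0.97, 1.00]

print(f"Estimated Tm: {melting_temperature(temps_c, unfolded):.1f} °C")
```

Linear interpolation is a rough choice; fitting a two-state model to the whole curve would use all of the data, but the midpoint definition is the same.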

Most proteins are stable at temperatures up to 50°C to 60°C under physiological 
conditions. Some species of bacteria, such as those that inhabit hot springs and the 
vicinity of deep ocean thermal vents, thrive at temperatures well above this range. Pro- 
teins in these species denature at much higher temperatures as expected. Biochemists 
are actively studying these proteins in order to determine how they resist denaturation. 

Proteins can also be denatured by two types of chemicals — chaotropic agents and 
detergents (Section 2.4). High concentrations of chaotropic agents, such as urea and 
guanidinium salts (Figure 4.32), denature proteins by allowing water molecules to solvate 
nonpolar groups in the interior of proteins. The water molecules disrupt the hydrophobic 
interactions that normally stabilize the native conformation. The hydrophobic tails of 

Figure 4.31 ► 

Heat denaturation of ribonuclease A. A solution of ribonuclease A in 0.02 M KCl at pH 2.1 was 
heated. Unfolding was monitored by changes in ultraviolet absorbance (blue), viscosity (red), and 
optical rotation (green). The y-axis is the fraction of the molecule unfolded at each temperature. 
[Adapted from Ginsburg, A., and Carroll, W. R. (1965). Some specific ion effects on the conformation 
and thermal stability of ribonuclease. Biochemistry 4:2159-2174.] 

[Plot for Figure 4.31: fraction unfolded versus temperature (°C), x-axis from 0 to 50.] 

112 CHAPTER 4 Proteins: Three-Dimensional Structure and Function 



Urea: H2N-CO-NH2 

Guanidinium chloride: [(H2N)2C=NH2]+ Cl- 
▲ Figure 4.32 

Urea and guanidinium chloride. 

▲ Figure 4.33 

Disulfide bridges in bovine ribonuclease A. (a) Location of disulfide bridges in the native protein. 
(b) View of the disulfide bridge between Cys-26 and Cys-84 [PDB 2AAS]. 

The numbering convention for amino 
acid residues in a polypeptide starts at 
the N-terminal end (Section 3.5). Cys-26 
is the 26th residue from the N-terminus. 

detergents, such as sodium dodecyl sulfate (Figure 2.8), also denature proteins by pene- 
trating the protein interior and disrupting hydrophobic interactions. 

The native conformation of some proteins (e.g., ribonuclease A) is stabilized by 
disulfide bonds. Disulfide bonds are not generally found in intracellular proteins but are 
sometimes found in proteins that are secreted from cells. The presence of disulfide bonds 
stabilizes proteins by making them less susceptible to unfolding and subsequent degra- 
dation when they are exposed to the external environment. Disulfide bond formation 
does not drive protein folding; instead, the bonds form where two cysteine residues are 
appropriately located once the protein has folded. Formation of a disulfide bond requires 
oxidation of the thiol groups of the cysteine residues (Figure 3.4), probably by disulfide- 
exchange reactions involving oxidized glutathione, a cysteine-containing tripeptide. 

Figure 4.33a shows the locations of the disulfide bridges in ribonuclease A. (Compare 
this orientation of the protein with that shown in Figure 4.3.) There are four disulfide 
bridges. They can link adjacent β strands, β strands to α helices, or β strands to 
loops. Figure 4.33b is a view of the disulfide bridge between a cysteine residue in an 
α helix (Cys-26) and a cysteine residue in a β strand (Cys-84). Note that the S-S bond 
does not align with the cysteine side chains. Disulfide bridges will form whenever the 
two cysteine sulfhydryl groups are in close proximity in the native conformation. 

Complete denaturation of proteins containing disulfide bonds requires cleavage of 
these bonds in addition to disruption of hydrophobic interactions and hydrogen bonds. 
2-Mercaptoethanol or other thiol reagents can be added to a denaturing medium in 
order to reduce any disulfide bonds to sulfhydryl groups (Figure 4.34). Reduction of the 
disulfide bonds of a protein is accompanied by oxidation of the thiol reagent. 

In a series of classic experiments, Christian B. Anfinsen and his coworkers studied 
the renaturation pathway of ribonuclease A that had been denatured in the presence of 
thiol reducing agents. Since ribonuclease A is a relatively small protein (124 amino acid 

Figure 4.34 ► 

Cleaving disulfide bonds. When a protein 
is treated with excess 2-mercaptoethanol 
(HSCH2CH2OH), a disulfide-exchange reaction 
occurs in which each cystine residue 
is reduced to two cysteine residues and 
2-mercaptoethanol is oxidized to a disulfide. 

[Reaction scheme: cystine residue + 2 HSCH2CH2OH → two cysteine residues + HOCH2CH2S-SCH2CH2OH] 

4.10 Protein Denaturation and Renaturation 113 

residues), it refolds (renatures) quickly once it is returned to conditions where the native 
form is stable (e.g., cooled below the melting temperature or removed from a solution 
containing chaotropic agents). Anfinsen was among the first to show that denatured 
proteins can refold spontaneously to their native conformation indicating that the in- 
formation required for the native three-dimensional conformation is contained in the 
amino acid sequence of the polypeptide chain. In other words, the primary structure 
determines the tertiary structure. 

Denaturation of ribonuclease A with 8 M urea containing 2-mercaptoethanol re- 
sults in complete loss of tertiary structure and enzymatic activity and yields a polypep- 
tide chain containing eight sulfhydryl groups (Figure 4.35). When 2-mercaptoethanol is 
removed and oxidation is allowed to occur in the presence of urea, the sulfhydryl groups 
pair randomly so that only about 1% of the protein population forms the correct four 
disulfide bonds and recovers its original enzymatic activity. (If the eight sulfhydryl groups 
pair randomly, 105 disulfide-bonded structures are possible: 7 possible pairings for the 
first bond, 5 for the second, 3 for the third, and 1 for the fourth (7 × 5 × 3 × 1 = 105), 
but only one of these structures is correct.) However, when urea and 2-mercaptoethanol 
are removed simultaneously and dilute solutions of the reduced protein are then exposed 
to air, ribonuclease A spontaneously regains its native conformation, its correct set of 
disulfide bonds, and its full enzymatic activity. The inactive proteins containing randomly 
formed disulfide bonds can be renatured if urea is removed, a small amount of 
2-mercaptoethanol is added, and the solution gently warmed. Anfinsen's experiments 
demonstrate that the correct disulfide bonds can form only after the protein folds into its 
native conformation. Anfinsen concluded that the renaturation of ribonuclease A is 
spontaneous, driven entirely by the free energy gained in changing to the stable physio- 
logical conformation. This conformation is determined by the primary structure. 
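
The count of 105 possible disulfide arrangements can be checked by brute force. The sketch below is an illustration, not part of Anfinsen's analysis: the function names are hypothetical, and the cysteine positions and native pairings other than Cys-26 to Cys-84 are not given in this section; they are taken from the general literature on ribonuclease A and should be treated as assumptions.

```python
def count_pairings(n_thiols):
    """Number of ways to pair n_thiols sulfhydryl groups into disulfide bonds."""
    if n_thiols % 2:
        raise ValueError("an even number of sulfhydryl groups is required")
    count = 1
    for choices in range(n_thiols - 1, 0, -2):  # 7, 5, 3, 1 for eight groups
        count *= choices
    return count


def enumerate_pairings(cysteines):
    """Yield every complete set of disulfide pairs as a list of tuples."""
    cysteines = list(cysteines)
    if not cysteines:
        yield []
        return
    first, rest = cysteines[0], cysteines[1:]
    for i, partner in enumerate(rest):
        remaining = rest[:i] + rest[i + 1:]
        for tail in enumerate_pairings(remaining):
            yield [(first, partner)] + tail


# Assumed cysteine positions and native pairs (see the note above).
CYS_POSITIONS = [26, 40, 58, 65, 72, 84, 95, 110]
NATIVE = {(26, 84), (40, 95), (58, 110), (65, 72)}

all_pairings = list(enumerate_pairings(CYS_POSITIONS))
native_hits = sum(1 for pairing in all_pairings if set(pairing) == NATIVE)

print(count_pairings(8), len(all_pairings), native_hits)
print(f"Chance of the native set by random pairing: {100 / len(all_pairings):.2f}%")
```

One arrangement in 105 is just under 1%, which matches the fraction of active protein recovered when the sulfhydryl groups are allowed to pair randomly in urea.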

Proteins occasionally adopt a nonnative conformation and form inappropriate 
disulfide bridges when they fold inside a cell. Anfinsen discovered an enzyme, called 
protein disulfide isomerase (PDI), that catalyzes reduction of these incorrect bonds. All 

▲ Christian B. Anfinsen (1916-1995). 

Anfinsen was awarded the Nobel Prize 
in Chemistry in 1972 for his work on the 
refolding of proteins. 


◄ Figure 4.35 

Denaturation and renaturation of ribonuclease A. 

Treatment of native ribonuclease A (top) with 
urea in the presence of 2-mercaptoethanol 
unfolds the protein and disrupts disulfide 
bonds to produce reduced, reversibly dena- 
tured ribonuclease A (bottom). When the 
denatured protein is returned to physiological 
conditions in the absence of 2-mercap- 
toethanol, it refolds into its native conforma- 
tion and the correct disulfide bonds form. 
However, when 2-mercaptoethanol alone is 
removed, ribonuclease A reoxidizes in the 
presence of air, but the disulfide bonds form 
randomly, producing inactive protein (such 
as the form shown on the right). When urea 
is removed, a trace of 2-mercaptoethanol 
is added to the randomly reoxidized protein, 
and the solution is warmed gently, the disul- 
fide bonds break and re-form correctly to 
produce native ribonuclease A. 







▲ Figure 4.36 

Energy well of protein folding. The funnels 
represent the free-energy potential of folding 
proteins. (a) A simplified funnel showing 
two possible pathways to the low-energy 
native protein. In path B, the polypeptide 
enters a local low-energy minimum as it 
folds. (b) A more realistic version of the possible 
free-energy forms of a folding protein 
with many local peaks and dips. 


Most proteins fold spontaneously into a 
conformation with the lowest energy. 

living cells contain such an activity. The enzyme contains two reduced cysteine residues 
positioned in the active site. When the misfolded protein binds, the enzyme catalyzes a 
disulfide-exchange reaction whereby the disulfide in the misfolded protein is reduced 
and a new disulfide bridge is created between the two cysteine residues in the enzyme. 
The misfolded protein is then released and it can refold into the low-energy native 
conformation. The structure of the reduced form of E. coli disulfide isomerase (DsbA) 
is shown in Figure 4.24o. 

4.11 Protein Folding and Stability 

New polypeptides are synthesized in the cell by a translation complex that includes 
ribosomes, mRNA, and various factors (Chapter 21). As the newly synthesized polypep- 
tide emerges from the ribosome, it folds into its characteristic three-dimensional shape. 
Folded proteins occupy a low-energy well that makes the native structure much more 
stable than alternative conformations (Figure 4.36). The in vitro experiments of Anfinsen 
and many other biochemists demonstrate that many proteins can fold spontaneously to 
reach this low-energy conformation. In this section we discuss the characteristics of 
those proteins that fold into a stable three-dimensional structure. 

It is thought that as a protein folds the first few interactions trigger subsequent 
interactions. This is an example of cooperative effects in protein folding — the phenom- 
enon whereby the formation of one part of a structure leads to the formation of the 
remaining parts of the structure. As the protein begins to fold, it adopts lower and lower 
energies and begins to fall into the energy well shown in Figure 4.36. The protein may 
become temporarily trapped in a local energy well (shown as small dips in the energy 
diagram) but eventually it reaches the energy minimum at the bottom of the well. In its 
final, stable, conformation, the native protein is much less sensitive to degradation than 
an extended, unfolded polypeptide chain. Thus, native proteins can have half-lives of 
many cell generations and some molecules may last for decades. 

Folding is extremely rapid — in most cases the native conformation is reached in 
less than a second. Protein folding and stabilization depend on several noncovalent 
forces including the hydrophobic effect, hydrogen bonding, van der Waals interactions, 
and charge-charge interactions. Although noncovalent interactions are weak individu- 
ally, collectively they account for the stability of the native conformations of proteins. 
The weakness of each noncovalent interaction gives proteins the resilience and flexibil- 
ity to undergo small conformational changes. (Covalent disulfide bonds also contribute 
to the stability of certain proteins.) 

In multidomain proteins the different domains fold independently of one another 
as much as possible. One reason that domains are limited in size (usually 
fewer than 200 residues) is that domains larger than about 300 residues would fold too 
slowly: the rate of spontaneous folding would be too slow to be useful. 

No actual protein-folding pathway has yet been described in detail but current re- 
search is focused on intermediates in the folding pathways of a number of proteins. Sev- 
eral hypothetical folding pathways are shown in Figure 4.37. During protein folding, the 
polypeptide collapses upon itself due to the hydrophobic effect and elements of second- 
ary structure begin to form. This intermediate is called a molten globule. Subsequent 
steps involve rearrangement of the backbone chain to form characteristic motifs and, fi- 
nally, the stable native conformation. 

The mechanism of protein folding is one of the most challenging problems in bio- 
chemistry. The process is spontaneous and must be largely determined by the primary 
structure (sequence) of the polypeptide. It should be possible, therefore, to predict the 
structure of a protein from knowledge of its amino acid sequence. Much progress has 
been made in recent years by modeling the folding process using fast computers. 

In the remainder of this section, we examine the forces that stabilize protein struc- 
ture in more detail. We will also describe the role of chaperones in protein folding. 

A. The Hydrophobic Effect 

Proteins are more stable in water when their hydrophobic side chains are aggregated in 
the protein interior rather than exposed on the surface to the aqueous medium. Because 


◄ Figure 4.37 

Hypothetical protein-folding pathways. The 

initially extended polypeptide chains form 
partial secondary structures, then approxi- 
mate tertiary structures, and finally the 
unique native conformations. The arrows 
within the structures indicate the direction 
from the N- to the C-terminus. 

water molecules interact more strongly with each other than with the nonpolar side 
chains of a protein, the side chains are forced to associate with one another causing 
the polypeptide chain to collapse into a more compact molten globule. The entropy 
of the polypeptide decreases as it becomes more ordered. This decrease is more than 
offset by the increase in solvent entropy as water molecules that were previously bound 
to the protein are released. (Folding also disrupts extended cages of water molecules 
surrounding hydrophobic groups.) This overall increase in the entropy of the system 
provides the major driving force for protein folding. 

Whereas nonpolar side chains are driven into the interior of the protein, most 
polar side chains remain in contact with water on the surface of the protein. The sec- 
tions of the polar backbone that are forced into the interior of a protein neutralize their 
polarity by hydrogen bonding to each other, often generating secondary structures. 
Thus, the hydrophobic nature of the interior not only accounts for the association of 
hydrophobic residues but also contributes to the stability of helices and sheets. Studies 
of folding pathways indicate that hydrophobic collapse and formation of secondary 
structures occur simultaneously. 

Localized examples of this hydrophobic effect are the interactions of the hydrophobic 
side of an amphipathic α helix with the protein core (Section 4.4) and the hydrophobic 
region between β sheets in the β-sandwich structure (Section 4.5). Most of 
the examples shown in Figures 4.25 and 4.26 contain juxtaposed regions of secondary 
structure that are stabilized by hydrophobic interactions between the side chains of 
hydrophobic amino acid residues. 

B. Hydrogen Bonding 

Hydrogen bonds contribute to the cooperativity of folding and help stabilize the native 
conformations of proteins. The hydrogen bonds in α helices, β sheets, and turns are the 
first to form, giving rise to defined regions of secondary structure. The final native 
structure also contains hydrogen bonds between the polypeptide backbone and water, 
between the polypeptide backbone and polar side chains, between two polar side 
chains, and between polar side chains and water. Table 4.2 shows some of the many 
types of hydrogen bonds found in proteins along with their typical bond lengths. Most 
hydrogen bonds in proteins are of the N-H···O type. The distance between the donor 
and acceptor atoms varies from 0.26 to 0.34 nm and the bonds may deviate from linear- 
ity by up to 40°. Recall that hydrogen bonds within the hydrophobic core of a protein 
are much more stable than those that form near the surface because the internal hydro- 
gen bonds don’t compete with water molecules. 
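
The distance and linearity ranges quoted above can be turned into a simple geometric test. The following is a minimal sketch under stated assumptions: the function names and the coordinates are hypothetical, coordinates are in nanometres, and real structure-analysis programs apply more elaborate criteria than these two cutoffs.

```python
import math


def angle_at(vertex, a, b):
    """Angle a-vertex-b in degrees."""
    v1 = [p - q for p, q in zip(a, vertex)]
    v2 = [p - q for p, q in zip(b, vertex)]
    dot = sum(x * y for x, y in zip(v1, v2))
    norm = math.hypot(*v1) * math.hypot(*v2)
    cosine = max(-1.0, min(1.0, dot / norm))  # guard against rounding error
    return math.degrees(math.acos(cosine))


def is_hydrogen_bond(donor, hydrogen, acceptor,
                     min_nm=0.26, max_nm=0.34, max_deviation_deg=40.0):
    """Apply the donor-acceptor distance range and the linearity limit."""
    separation = math.dist(donor, acceptor)
    deviation = 180.0 - angle_at(hydrogen, donor, acceptor)
    return min_nm <= separation <= max_nm and deviation <= max_deviation_deg


# Hypothetical coordinates (nm) for an N-H donor and an O acceptor.
n_atom, h_atom = (0.0, 0.0, 0.0), (0.1, 0.0, 0.0)
print(is_hydrogen_bond(n_atom, h_atom, (0.29, 0.0, 0.0)))  # linear, 0.29 nm apart
print(is_hydrogen_bond(n_atom, h_atom, (0.50, 0.0, 0.0)))  # linear but too far apart
print(is_hydrogen_bond(n_atom, h_atom, (0.10, 0.27, 0.0)))  # close enough but bent by 90°
```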


Entropically driven reactions are reactions 
where the most important thermodynamic 
change is an increase in entropy of the 
system. We can say that the system is 
much more disordered at the end of the 
reaction than at the beginning. In the case 
of hydrophobic interactions, the change 
in entropy is mostly due to the release 
of ordered water molecules that shield 
hydrophobic groups (Section 2.5D). 



Table 4.2 Examples of hydrogen bonds in proteins 

Columns: type of hydrogen bond; typical distance between donor and acceptor atom (nm). 

[Table rows: hydroxyl-hydroxyl (-O-H···O-); -O-H···O=C; N-H···; N-H···; amide-imidazole nitrogen (N-H···N). Distance values not shown.] 






The basic principles of protein folding are reasonably well 
understood and it seems certain that if a protein has a sta- 
ble three-dimensional structure it will be determined largely 
by the primary structure (sequence). This has led to efforts to 
predict tertiary structure from knowing the amino acid 
sequence. Biochemists have made huge advances in this the- 
oretical work in the last 30 years. 

The value of such work has to be assessed by making 
predictions of the structure of unknown proteins. This led in 
1994 to the beginning of CASP, the Critical Assessment of 
Methods of Protein Structure Prediction. This is a sort of 
game with no prizes other than the honor of being success- 
ful. Protein folding groups are given the amino acid se- 
quences of a number of targets and asked to predict the 
three-dimensional structure. The targets are drawn from 

those proteins whose structures have just been determined 
but the data haven’t yet been published. Contestants have 
only a few weeks to send in their predictions before the actual 
structures become known. 

The results of the 2008 CASP round are shown in the 
figure. There were 121 targets and thousands of predictions 
were submitted. Success ranged from nearly 100% for easy 
proteins to only about 30% for difficult ones. (“Easy” targets 
are those where the Protein Data Bank (PDB) already con- 
tains the structures of several homologous proteins. “Diffi- 
cult” targets are proteins with new folds that have never been 
solved.) The success rate for moderately difficult targets has 
climbed over the years as the prediction methods improved, 
but there’s plenty of opportunity to make winning predic- 
tions at the very difficult end of the scale. 


[Figure: prediction success versus target difficulty for the 2008 CASP round.] 




C. Van der Waals Interactions and Charge-Charge Interactions 

Van der Waals contacts between nonpolar side chains also contribute to the stability of 
proteins. The extent of stabilization due to optimized van der Waals interactions is dif- 
ficult to determine. The cumulative effect of many van der Waals interactions probably 
makes a significant contribution to stability because nonpolar side chains in the interior 
of a protein are densely packed. 

Charge-charge interactions between oppositely charged side chains may make a small 
contribution to the stability of proteins but most ionic side chains are found on the surfaces 
where they are solvated and can contribute only minimally to the overall stabilization of 
the protein. Nevertheless, two oppositely charged ions occasionally form an ion pair in the 
interior of a protein. Such ion pairs are much stronger than those exposed to water. 
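
Why a buried ion pair is stronger than a solvated one can be illustrated with Coulomb's law in a uniform dielectric. This is a rough sketch rather than a method from the text: a relative permittivity of about 80 for water is a standard value, whereas the value of 4 for a protein interior and the 0.4 nm separation are assumptions chosen only for illustration.

```python
# Coulomb constant in kJ·nm/(mol·e^2): charges are in units of the elementary
# charge and distances are in nanometres.
COULOMB_KJ_NM_PER_MOL = 138.935


def ion_pair_energy(q1, q2, distance_nm, relative_permittivity):
    """Coulomb energy (kJ/mol) of two point charges in a uniform dielectric."""
    return COULOMB_KJ_NM_PER_MOL * q1 * q2 / (relative_permittivity * distance_nm)


# Assumed values: 0.4 nm separation, permittivity 80 (water) and 4 (interior).
exposed = ion_pair_energy(+1, -1, 0.4, relative_permittivity=80.0)
buried = ion_pair_energy(+1, -1, 0.4, relative_permittivity=4.0)

print(f"Solvent-exposed ion pair: {exposed:.1f} kJ/mol")
print(f"Buried ion pair:          {buried:.1f} kJ/mol")
print(f"Buried/exposed ratio:     {buried / exposed:.0f}")
```

In this continuum picture the attraction scales inversely with the permittivity, so the same pair of charges interacts much more strongly in the nonpolar interior than at the solvated surface.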

D. Protein Folding Is Assisted by Molecular Chaperones 

Studies of protein folding have led to two general observations regarding the folding of 
polypeptide chains into biologically active proteins. First, protein folding does not in- 
volve a random search in three-dimensional space for the native conformation. Instead, 
protein folding appears to be a cooperative, sequential process in which formation of 
the first few structural elements assists in the alignment of subsequent structural features. 
[The need for cooperativity is illustrated by a calculation made by Cyrus 
Levinthal. Consider a polypeptide of 100 residues. If each residue had three possible 
conformations that could interconvert on a picosecond time scale then a random search 
of all 3^100 (about 5 × 10^47) possible conformations for the complete polypeptide would 
take about 5 × 10^35 seconds, many times the estimated age of the universe (6 × 10^17 seconds)!] 
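
The arithmetic behind Levinthal's argument is easy to reproduce. The sketch below is illustrative: the function name is hypothetical, and the number of conformations per residue and the time per conformation are left as parameters because different versions of the estimate use different values. With 3 conformations per residue and 1 ps per step the search takes roughly 5 × 10^35 seconds; with 10 conformations per residue and 0.1 ps per step it takes 10^87 seconds. Either result dwarfs the age of the universe quoted in the text.

```python
AGE_OF_UNIVERSE_S = 6e17  # value quoted in the text


def random_search_time(n_residues, conformations_per_residue, seconds_per_step):
    """Time (s) to visit every conformation once at a fixed rate."""
    n_conformations = conformations_per_residue ** n_residues
    return n_conformations * seconds_per_step


three_state = random_search_time(100, 3, 1e-12)
ten_state = random_search_time(100, 10, 1e-13)

print(f"3 conformations, 1 ps each:    {three_state:.1e} s")
print(f"10 conformations, 0.1 ps each: {ten_state:.1e} s")
print(f"Ratio to the age of the universe (3-state case): "
      f"{three_state / AGE_OF_UNIVERSE_S:.1e}")
```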

Second, to a first approximation the folding pattern and the final conformation of a 
protein depend on its primary structure. (Many proteins bind metal ions and coenzymes 
as described in Chapter 7. These external ligands are also required for proper folding.) As 
we saw in the case of ribonuclease A, simple proteins may fold spontaneously into their 
native conformations in a test tube without any energy input or assistance. Larger proteins 
will also fold spontaneously into their native structures since the final conformation rep- 
resents the minimal free energy form. However, larger proteins are more likely to become 
temporarily trapped in a local energy well of the type illustrated in Figure 4.36b. The pres- 
ence of such metastable incorrect conformations at best slows the rate of protein folding 
and at worst causes the folding intermediates to aggregate and fall out of solution. In 
order to overcome this problem inside the cell, the rate of correct protein folding is en- 
hanced by a group of ubiquitous special proteins called molecular chaperones. 

Chaperones increase the rate of correct folding of some proteins by binding newly 
synthesized polypeptides before they are completely folded. They prevent the formation 
of incorrectly folded intermediates that may trap the polypeptide in an aberrant form. 
Chaperones can also bind to unassembled protein subunits to prevent them from ag- 
gregating incorrectly and precipitating before they are assembled into a complete multi- 
subunit protein. 

There are many different chaperones. Most of them are heat shock proteins — pro- 
teins that are synthesized in response to temperature increases (heat shock) or other changes 
that cause protein denaturation in vivo. The role of heat shock proteins — now recognized 
as chaperones — is to repair the damage caused by temperature increases by binding to dena- 
tured proteins and helping them to refold rapidly into their native conformation. 

The major heat shock protein is Hsp70 (heat shock protein, Mr = 70,000). This 
protein is present in all species except for some species of archaebacteria. In bacteria, it 
is also called DnaK. The normal role of the chaperone Hsp70 is to bind to nascent 

► Heat shock proteins. Proteins were synthesized for a short time in the presence of radioactive 
amino acids then run on an SDS-polyacrylamide gel. The gel was exposed to film to detect radioactive 
proteins. The resulting autoradiograph shows only those proteins that were labeled during the time 
of exposure to radioactive amino acids. Lanes “C” are proteins synthesized at normal growth tem- 
peratures, and lanes “H” are proteins synthesized during a short heat shock where cells are shifted 
to a temperature a few degrees above their normal growth temperature. The induction of heat 
shock proteins (chaperones) in four different species is shown. Red dots indicate major heat shock 
proteins: top = Hsp90, middle = Hsp70, bottom = Hsp60(GroEL). 




► Figure 4.38 

Escherichia coli chaperonin (GroE). The core 
structure consists of two identical rings 
composed of seven GroEL subunits. Un- 
folded proteins bind to the central cavity. 
Bound ATP molecules can be identified by 
their red oxygen atoms. (a) Side view. (b) 
Top view showing the central cavity. [PDB 
1DER]. (c) During folding the size of the 
central cavity of one of the rings increases 
and the end is capped by a protein contain- 
ing seven GroES subunits. [PDB 1AON]. 

proteins while they are being synthesized in order to prevent aggregation or entrapment 
in a local low-energy well. The binding and release of nascent polypeptides is coupled to 
the hydrolysis of ATP and usually requires additional accessory proteins. Hsp70/DnaK 
is one of the most highly conserved proteins known in all of biology. This indicates that 
chaperone-assisted protein folding is an ancient and essential requirement for efficient 
synthesis of proteins with the correct three-dimensional structure. 

Another important and ubiquitous chaperone is called chaperonin (also called 
GroE in bacteria). Chaperonin is also a heat shock protein (Hsp60) that plays an impor- 
tant and essential role in assisting normal protein folding inside the cell. 

E. coli chaperonin is a complex multisubunit protein. The core structure consists of 
two rings containing seven identical GroEL subunits. Each subunit can bind a molecule 
of ATP (Figure 4.38a). A simplified version of chaperonin-assisted folding is shown in 
Figure 4.39. Unfolded proteins bind to the hydrophobic central cavity enclosed by the 
rings. When folding is complete, the protein is released by hydrolysis of the bound ATP 
molecules. The actual pathway is more complicated and requires an additional component 
that serves as a cap sealing one end of the central cavity while the folding process takes place. 

Figure 4.39 ► 

Chaperonin-assisted protein folding. The un- 
folded polypeptide enters the central cavity 
of chaperonin, where it folds. The hydrolysis 
of several ATP molecules is required for 
chaperonin function. 




The cap contains seven GroES subunits forming an additional ring (Figure 4.38c). The 
conformation of the GroEL ring can be altered during folding to increase the size of the 
cavity and the role of the cap is to prevent the unfolded protein from being released 

As mentioned earlier, some proteins tend to aggregate during folding in the absence 
of chaperones. Aggregation is probably due to temporary formation of hydrophobic sur- 
faces on folding intermediates. The intermediates bind to each other and the result is that 
they are taken out of solution and are no longer able to explore the conformations repre- 
sented by the energy funnel shown in Figure 4.36. Chaperonins isolate polypeptide 
chains in the folding cavity and thus prevent folding intermediates from aggregating. 
The folding cavity serves as an “Anfinsen cage” that allows the chain to reach the correct 
low-energy conformation without interference from other folding intermediates. 

The central cavity of chaperonin is large enough to accommodate a polypeptide 
chain of about 630 amino acid residues (Mr = 70,000). Thus, the folding of most small 
and medium-sized proteins can be assisted by chaperonin. However, only about 5% to 
10% of E. coli proteins (i.e., about 300 different proteins) appear to interact with chaperonin 
during protein synthesis. Medium-sized proteins and those of the α/β structural 
class are more likely to require chaperonin-assisted folding. Smaller proteins are able to 
fold quickly on their own. Many of the remaining proteins in the cell require other 
chaperones, such as Hsp70/DnaK. 
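
The correspondence between about 630 residues and a relative molecular mass of about 70,000 follows from a common rule of thumb. The sketch below assumes an average residue mass of roughly 110, which is an approximation not stated in this section; the function names are hypothetical.

```python
AVERAGE_RESIDUE_MASS = 110  # assumed rule-of-thumb average for an amino acid residue


def approximate_mr(n_residues, residue_mass=AVERAGE_RESIDUE_MASS):
    """Rough relative molecular mass of a polypeptide of n_residues."""
    return n_residues * residue_mass


def approximate_length(mr, residue_mass=AVERAGE_RESIDUE_MASS):
    """Rough number of residues in a polypeptide of relative molecular mass mr."""
    return round(mr / residue_mass)


print(approximate_mr(630))        # rough mass of a chain that fills the cavity
print(approximate_length(70000))  # rough length of a chain of Mr 70,000
```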

Chaperones appear to inhibit incorrect folding and assembly pathways by forming 
stable complexes with surfaces on polypeptide chains that are exposed only during syn- 
thesis, folding, and assembly. Even in the presence of chaperones, protein folding is 
spontaneous; for this reason, chaperone-assisted protein folding has been described as 
assisted self-assembly. 

4.12 Collagen, a Fibrous Protein 

To conclude our examination of the three-dimensional structure of proteins, we exam- 
ine several proteins to see how their structures are related to their biological functions. 
The proteins selected for more detailed study are the structural protein collagen, the 
oxygen-binding proteins myoglobin and hemoglobin (Sections 4.12 to 4.13), and anti- 
bodies (Section 4.14). 

Collagen is the major protein component of the connective tissue of vertebrates. It 
makes up about 30% of the total protein in mammals. Collagen molecules have 
remarkably diverse forms and functions. For example, collagen in tendons forms stiff, 
ropelike fibers of tremendous tensile strength whereas in skin, collagen takes the form 
of loosely woven fibers permitting expansion in all directions. 

The structure of collagen was worked out by G. N. Ramachandran (famous for his 
Ramachandran plots, Section 4.3). The molecule consists of three left-handed heli- 
cal chains coiled around each other to form a right-handed supercoil (Figure 4.40). 

▲ Figure 4.40 

The human type III collagen triple helix. The 

extended region of collagen contains three 
identical subunits (purple, light blue, and 
green). Three left-handed collagen helices 
are coiled around one another to form a 
right-handed supercoil. [PDB 1BKV] 

◄ G.N. Ramachandran (1922-2001). In this 
photograph he is illustrating the difference 
between an α helix and the left-handed 
triple helix of collagen. Note that he has deliberately 
drawn the α helix as a left-handed 
helix and not the standard right-handed 
form found in most proteins. 



[Structure of a 4-hydroxyproline residue: a proline ring with a hydroxyl group on C-4.] 

▲ Figure 4.41 

4-Hydroxyproline residue. 4-Hydroxyproline 
residues are formed by enzyme-catalyzed 
hydroxylation of proline residues. 

The requirement for vitamin C is 
explained in Section 7.9. 

Figure 4.43 ► 

5-Hydroxylysine residue. 5-Hydroxylysine 
residues are formed by enzyme-catalyzed 
hydroxylation of lysine residues. 





◄ Figure 4.42 

Interchain hydrogen bonding in collagen. The 

amide hydrogen of a glycine residue in one 
chain is hydrogen-bonded to the carbonyl 
oxygen of a residue, often proline, in an 
adjacent chain. 

Each left-handed helix in collagen has 3.0 amino acid residues per turn and a pitch of 
0.94 nm giving a rise of 0.31 nm per residue. Consequently, a collagen helix is more extended 
than an α helix and the coiled-coil structure of collagen is not the same as the 
coiled-coil motif discussed in Section 4.7. (Several proteins unrelated to collagen also 
form similar three-chain supercoils.) 
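
The rise per residue is simply the pitch divided by the number of residues per turn, which makes the comparison with the α helix easy to quantify. In this sketch the collagen values come from the paragraph above, while the α-helix values (3.6 residues per turn, 0.54 nm pitch) are standard figures supplied here as an assumption; the function name is hypothetical.

```python
def rise_per_residue(pitch_nm, residues_per_turn):
    """Distance advanced along the helix axis per residue (nm)."""
    return pitch_nm / residues_per_turn


collagen_rise = rise_per_residue(0.94, 3.0)
alpha_helix_rise = rise_per_residue(0.54, 3.6)  # assumed standard α-helix values

print(f"Collagen helix: {collagen_rise:.2f} nm per residue")
print(f"Alpha helix:    {alpha_helix_rise:.2f} nm per residue")
print(f"Collagen is about {collagen_rise / alpha_helix_rise:.1f} times more extended")
```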

The collagen triple helix is stabilized by interchain hydrogen bonds. The sequence of 
the protein in the helical region consists of multiple repeats of the form -Gly-X-Y-, where 
X is often proline and Y is often a modified proline called 4-hydroxyproline (Figure 4.41). 
The glycine residues are located along the central axis of the triple helix, where tight pack- 
ing of the protein strands can accommodate no other residue. For each -Gly-X-Y- triplet, 
one hydrogen bond forms between the amide hydrogen atom of glycine in one chain and 
the carbonyl oxygen atom of residue X in an adjacent chain (Figure 4.42). Hydrogen bonds 
involving the hydroxyl group of hydroxyproline may also stabilize the collagen triple helix. 
Unlike the more common α helix, the collagen helix has no intrachain hydrogen bonds. 
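
The -Gly-X-Y- repeat is regular enough to detect with a few lines of code. The sketch below is only an illustration: the function names are hypothetical, both test sequences are made up, and the one-letter code has no symbol for 4-hydroxyproline, so proline (P) stands in for it here.

```python
def glycine_repeat_fraction(sequence, frame=0):
    """Fraction of consecutive triplets, read from `frame`, that begin with Gly."""
    triplets = [sequence[i:i + 3] for i in range(frame, len(sequence) - 2, 3)]
    if not triplets:
        return 0.0
    return sum(triplet[0] == "G" for triplet in triplets) / len(triplets)


def best_frame(sequence):
    """Reading frame (0, 1, or 2) with the highest fraction of Gly-first triplets."""
    return max(range(3), key=lambda frame: glycine_repeat_fraction(sequence, frame))


collagen_like = "GPPGPAGPKGPP"     # hypothetical Gly-X-Y repeat
globular_like = "MKTAYIAKQRQISFVK"  # hypothetical non-repetitive sequence

print(best_frame(collagen_like), glycine_repeat_fraction(collagen_like, 0))
print(glycine_repeat_fraction(globular_like, 0))
```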

In addition to hydroxyproline, collagen contains an additional modified amino 
acid residue called 5-hydroxylysine (Figure 4.43). Some hydroxylysine residues are co- 
valently bonded to carbohydrate residues, making collagen a glycoprotein. The role of 
this glycosylation is not known. 

Hydroxyproline and hydroxylysine residues are formed when specific proline and 
lysine residues are hydroxylated after incorporation into the polypeptide chains of col- 
lagen. The hydroxylation reactions are catalyzed by enzymes and require ascorbic acid 
(vitamin C). Hydroxylation is impaired in the absence of vitamin C, and the triple helix 
of collagen is not assembled properly. 

The limited conformational flexibility of proline and hydroxyproline residues prevents 
the formation of α helices in collagen chains and also makes collagen somewhat 
rigid. (Recall that proline is almost never found in α helices.) The presence of glycine 
residues at every third position allows collagen chains to form a tightly wound left-handed 
helix that accommodates the proline residues. (Recall that the flexibility of 
glycine residues tends to disrupt the right-handed α helix.) 

Collagen triple helices aggregate in a staggered fashion to form strong, insoluble 
fibers. The strength and rigidity of collagen fibers result in part from covalent 





[Figure 4.44 reaction schemes: (a) an allysine residue and a lysine residue condense to form a Schiff base, -CH2-CH2-CH2-CH=N-CH2-CH2-CH2-CH2-; (b) an aldol-type cross-link, -CH2-CH2-CH2-CH=C(CHO)-CH2-CH2-.] 



cross-links. The — CH 2 NH 3 + groups of the side chains of some lysine and hydroxyly- 
sine residues are converted enzymatically to aldehyde groups ( — CHO), producing ally- 
sine and hydroxyallysine residues. Allysine residues (and their hydroxy derivatives) react 
with the side chains of lysine and hydroxylysine residues to form Schiff bases, complexes 
formed between carbonyl groups and amines (Figure 4.44a). These Schiff bases usually 
form between collagen molecules. Allysine residues also react with other allysine 
residues by aldol condensation to form cross-links, usually between the individual 
strands of the triple helix (Figure 4.44b). Both types of cross-links are converted to 
more stable bonds during the maturation of tissues, but the chemistry of these conver- 
sions is unknown. 


Not all fibrous proteins are composed of α helices. Silk is composed of a number of 
proteins that are predominantly β strands. The dragline silk of the spider Nephila 
clavipes, for example, contains two proteins called spidroin 1 and spidroin 2. Both 
proteins contain multiple stretches of alanine residues separated by residues that 
are mostly glycine. The structure of this silk is not known in spite of major efforts 
by many laboratories. However, it is known that the proteins contain extensive 
regions of β strands. 

There are many different kinds of spider silk, and spiders have specialized 
glands for each type. The silk fiber produced by the major ampullate gland is 
called dragline silk; it is the fiber that spiders use to drop out of danger or anchor 
their webs. This silk fiber is quite literally stronger than steel cable. Materials 
manufactured from dragline silk would be very useful in a number of applications, 
one of which would be personal armor because dragline silk is stronger 
than Kevlar. So far it has not been possible to make significant amounts of silk in 
the laboratory without relying on spiders. 

Nephila clavipes, the golden silk spider. ► 

◄ Figure 4.44 

Covalent cross-links in collagen. (a) An allysine 
residue condenses with a lysine residue 
to form an intermolecular Schiff-base cross-link. 
(b) Two allysine residues condense to 
form an intramolecular cross-link. 


▲ Figure 4.45 

Chemical structure of the Fe(II)-protoporphyrin 
IX heme group in myoglobin and hemoglobin. 

The porphyrin ring provides four of the six 
ligands that surround the iron atom. 

▲ Figure 4.46 

Sperm whale (Physeter catodon) oxymyoglobin. 

Myoglobin consists of eight α helices. The 
heme prosthetic group binds oxygen (red). 
His-64 (green) forms a hydrogen bond with 
oxygen, and His-93 (green) is complexed to 
the iron atom of the heme. [PDB 1A6M]. 

▲ John Kendrew’s original model of myoglo- 
bin determined from his X-ray diffraction 
data in the 1950s. The model is made of 
plasticine. It was the first three-dimensional 
model of a protein. 

4.13 Structures of Myoglobin and Hemoglobin 

Like most proteins, myoglobin (Mb) and the related protein hemoglobin (Hb) carry 
out their biological functions by selectively and reversibly binding other molecules, in 
this case molecular oxygen (O₂). Myoglobin is a relatively small monomeric protein 
that facilitates the diffusion of oxygen in vertebrates. It is responsible for supplying 
oxygen to muscle tissue in reptiles, birds, and mammals. Hemoglobin is a larger tetrameric 
protein that carries oxygen in blood. 

The red color associated with the oxygenated forms of myoglobin and hemoglobin 
(e.g., the red color of oxygenated blood) is due to a heme prosthetic group (Figure 4.45). 
(A prosthetic group is a protein-bound organic molecule essential for the activity of the 
protein.) Heme consists of a tetrapyrrole ring system (protoporphyrin IX) complexed 
with iron. The four pyrrole rings of this system are linked by methene (-CH=) 
bridges so that the unsaturated porphyrin is highly conjugated and planar. The bound 
iron is in the ferrous, or Fe²⁺, oxidation state; it forms a complex with six ligands, four 
of which are the nitrogen atoms of protoporphyrin IX. (Other proteins, such as 
cytochrome a and cytochrome c, contain different porphyrin/heme groups.) 

Myoglobin is a member of a family of proteins called globins. The tertiary structure 
of sperm whale myoglobin shows that the protein consists of a bundle of eight α helices 
(Figure 4.46). It is a member of the all-α structural category. The globin fold has several 
groups of α helices that form a layered structure. Adjacent helices in each layer are tilted 
at an angle that allows the side chains of the amino acid residues to interdigitate. 

The interior of myoglobin is made up almost exclusively of hydrophobic amino 
acid residues, particularly those that are highly hydrophobic — valine, leucine, isoleucine, 
phenylalanine, and methionine. The surface of the protein contains both hydrophilic 
and hydrophobic residues. As is the case with most proteins, the tertiary structure of 
myoglobin is stabilized by hydrophobic interactions within the core. Folding of the 
polypeptide chain is driven by the energy minimization that results from formation of 
this hydrophobic core. 

The heme prosthetic group of myoglobin occupies a hydrophobic cleft formed by 
three α helices and two loops. The binding of the porphyrin moiety to the polypeptide is 
due to a number of weak interactions including hydrophobic interactions, van der Waals 
contacts, and hydrogen bonds. There are no covalent bonds between the porphyrin and 
the amino acid side chains of myoglobin. The iron atom of heme is the site of oxygen 
binding as shown in Figure 4.46. Two histidine residues interact with the iron atom and 
the bound oxygen. Accessibility of the heme group to molecular oxygen depends on 
slight movement of nearby amino acid side chains. We will see later that the hydrophobic 
crevices of myoglobin and hemoglobin are essential for the reversible binding of oxygen. 

In vertebrates, O₂ is bound to molecules of hemoglobin for transport in red blood 
cells, or erythrocytes. Viewed under a microscope, a mature mammalian erythrocyte is a 
biconcave disk that lacks a nucleus or other internal membrane-enclosed compartments 
(Figure 4.47). A typical human erythrocyte is filled with approximately 3 × 10⁸ 
hemoglobin molecules. 

Hemoglobin is more complex than myoglobin because it is a multisubunit protein. 
In adult mammals, hemoglobin contains two different globin subunits called α-globin 
and β-globin. Hemoglobin is an α₂β₂ tetramer: it contains two α chains and two 
β chains. Each of these globin subunits is similar in structure and sequence to myoglobin, 
reflecting their evolution from a common ancestral globin gene in primitive chordates. 

Each of the four globin subunits contains a heme prosthetic group identical to that 
found in myoglobin. The α and β subunits face each other across a central cavity 
(Figure 4.48). The tertiary structure of each of the four chains is almost identical to that 
of myoglobin (Figure 4.49). The α chain has seven α helices, and the β chain has eight. 
(Two short α helices found in β-globin and myoglobin are fused into one larger one in 
α-globin.) Hemoglobin, however, is not simply a tetramer of myoglobin molecules. Each 
α chain interacts extensively with a β chain, so hemoglobin is actually a dimer of αβ 
subunits. We will see in the following section that the presence of multiple subunits is 
responsible for oxygen-binding properties that are not possible with single-chain myoglobin. 


▲ Figure 4.48 

Human (Homo sapiens) oxyhemoglobin. (a) Structure of human oxyhemoglobin showing two α and two 
β subunits. Heme groups are shown as stick models. [PDB 1HND]. (b) Schematic diagram of the 
hemoglobin tetramer. The heme groups are red. 

4.14 Oxygen Binding to Myoglobin and Hemoglobin 

The oxygen-binding activities of myoglobin and hemoglobin provide an excellent ex- 
ample of how protein structure relates to physiological function. These proteins are 
among the most intensely studied proteins in biochemistry. They were the first complex 
proteins whose structure was determined by X-ray crystallography (Section 4.2). A 
number of the principles described here for oxygen-binding proteins also hold true for 
the enzymes that we will study in Chapters 5 and 6. In this section we examine the 
chemistry of oxygen binding to heme, the physiology of oxygen binding to myoglobin 
and hemoglobin, and the regulatory properties of hemoglobin. 

A. Oxygen Binds Reversibly to Heme 

We will use myoglobin as an example of oxygen binding to the heme prosthetic group, 
but the same principles apply to hemoglobin. The reversible binding of oxygen is called 
oxygenation. Oxygen-free myoglobin is called deoxymyoglobin and the oxygen-bearing 
molecule is called oxymyoglobin. (The two forms of hemoglobin are called deoxyhemoglobin 
and oxyhemoglobin.) 

Some substituents of the heme prosthetic group are hydrophobic — this feature 
allows the prosthetic group to be partially buried in the hydrophobic interior of the 
myoglobin molecule. Recall from Figure 4.46 that there are two polar residues, His-64 
and His-93, situated near the heme group. In oxymyoglobin, six ligands are coordinated 
to the ferrous iron, with the ligands in octahedral geometry around the metal cation 
(Figures 4.50 and 4.51). Four of the ligands are the nitrogen atoms of the tetrapyrrole ring 
system; the fifth ligand is an imidazole nitrogen from His-93 (called the proximal histidine); 
and the sixth ligand is molecular oxygen bound between the iron and the imidazole side 
chain of His-64 (called the distal histidine). In deoxymyoglobin, the iron is coordinated to 
only five ligands because oxygen is not present. The nonpolar side chains of Val-68 and 
Phe-43, shown in Figure 4.51, contribute to the hydrophobicity of the oxygen-binding 
pocket and help hold the heme group in place. Several side chains block the entrance to the 
heme-containing pocket in both oxymyoglobin and deoxymyoglobin. The protein struc- 
ture in this region must vibrate, or breathe, rapidly to allow oxygen to bind and dissociate. 

The hydrophobic crevice of the globin polypeptide holds the key to the ability of 
myoglobin and hemoglobin to suitably bind and release oxygen. Free heme does not reversibly 
bind oxygen in aqueous solution; instead, the Fe²⁺ of the heme is almost instantly 
oxidized to Fe³⁺. (Oxidation is equivalent to the loss of an electron, as described in 
Section 6.1C. Reduction is the gain of an electron. Oxidation and reduction refer to the 
transfer of electrons and not to the presence or absence of oxygen molecules.) 

▼ Figure 4.47 

Scanning electron micrograph of mammalian 
erythrocytes. Each cell contains approxi- 
mately 300 million hemoglobin molecules. 
The cells have been artificially colored. 

▲ Figure 4.49 

Tertiary structure of myoglobin, α-globin, and 
β-globin. The orientations of the individual 
α-globin and β-globin subunits of hemoglobin 
have been shifted in order to reveal the 
similarities in tertiary structure. The three 
structures have been superimposed. All 
of the structures are from the oxygenated 
forms shown in Figures 4.46 and 4.48. 
Color code: α-globin (blue), β-globin 
(purple), myoglobin (green). 




▲ Figure 4.50 

Oxygen-binding site of sperm whale oxymyo- 
globin. The heme prosthetic group is repre- 
sented by a parallelogram with a nitrogen 
atom at each corner. The blue dashed lines 
illustrate the octahedral geometry of the 
coordination complex. 

▲ Figure 4.51 

The oxygen-binding site in sperm whale myoglobin. 
Fe(II) (orange) lies in the plane of 
the heme group. Oxygen (green) is bound to 
the iron atom and the amino acid side chain 
of His-64. Val-68 and Phe-43 contribute to 
the hydrophobic environment of the oxygen-binding 
site. [PDB 1A6M]. 

The structure of myoglobin and hemoglobin prevents the permanent transfer of an 
electron, or irreversible oxidation, thereby ensuring the reversible binding of molecular 
oxygen for transport. The ferrous iron atom of heme in hemoglobin is partially 
oxidized when O₂ is bound. An electron is temporarily transferred toward the oxygen atom 
that is attached to the iron so that the molecule of dioxygen is partially reduced. If the 
electron were transferred completely to the oxygen, the complex would be Fe³⁺-O₂⁻ 
(a superoxide anion attached to ferric iron). The globin crevice prevents complete 
electron transfer and enforces return of the electron to the iron atom when O₂ dissociates. 

B. Oxygen-Binding Curves of Myoglobin and Hemoglobin 

Oxygen binds reversibly to myoglobin and hemoglobin. The extent of binding at 
equilibrium depends on the concentration of the protein and the concentration of oxygen. 
This relationship is depicted in oxygen-binding curves (Figure 4.52). In these figures, 
the fractional saturation (Y) of a fixed amount of protein is plotted against the 
concentration of oxygen (measured as the partial pressure of gaseous oxygen, pO₂). The 
fractional saturation of myoglobin or hemoglobin is the fraction of the total number of 
molecules that are oxygenated. 

$$Y = \frac{[\mathrm{MbO_2}]}{[\mathrm{MbO_2}] + [\mathrm{Mb}]}$$


The oxygen-binding curve of myoglobin is hyperbolic (Figure 4.52), indicating that there 
is a single equilibrium constant for the binding of O₂ to the macromolecule. In 
contrast, the curve depicting the relationship between oxygen concentrations and binding 
to hemoglobin is sigmoidal. Sigmoidal (S-shaped) binding curves indicate that more 
than one molecule of ligand is binding to each protein. In this case, up to four 
molecules of O₂ bind to hemoglobin, one per heme group of the tetrameric protein. The 
shape of the curve indicates that the oxygen-binding sites of hemoglobin interact such 
that the binding of one molecule of oxygen to one heme group facilitates binding of 
oxygen molecules to the other hemes. The oxygen affinity of hemoglobin increases as 
each oxygen molecule is bound. This interactive binding phenomenon is termed 
positive cooperativity of binding. 

The partial pressure at half-saturation (P₅₀) is a measure of the affinity of the 
protein for O₂. A low P₅₀ indicates a high affinity for oxygen since the protein is 
half-saturated with oxygen at a low oxygen concentration; similarly, a high P₅₀ signifies a low 
affinity. Myoglobin molecules are half-saturated at a pO₂ of 2.8 torr (1 atmosphere = 
760 torr). The P₅₀ for hemoglobin is much higher (26 torr), reflecting its lower affinity 
for oxygen. The heme prosthetic groups of myoglobin and hemoglobin are identical but 
the affinities of these groups for oxygen differ because the microenvironments provided 
by the proteins are slightly different. Oxygen affinity is an intrinsic property of the 
protein. It is similar to the equilibrium binding/dissociation constants that are commonly 
used to describe the binding of ligands to other proteins and enzymes (Section 4.9). 
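
The link between the hyperbolic curve, a single equilibrium constant, and the half-saturation pressure can be made explicit with a short derivation. This is a sketch under the assumption of simple one-site binding; the symbol $K_d$ for the dissociation constant is introduced here for illustration and does not appear in the passage above.

$$\mathrm{Mb} + \mathrm{O_2} \rightleftharpoons \mathrm{MbO_2}, \qquad K_d = \frac{[\mathrm{Mb}][\mathrm{O_2}]}{[\mathrm{MbO_2}]}$$

Substituting $[\mathrm{MbO_2}] = [\mathrm{Mb}][\mathrm{O_2}]/K_d$ into the definition of fractional saturation gives

$$Y = \frac{[\mathrm{MbO_2}]}{[\mathrm{MbO_2}] + [\mathrm{Mb}]} = \frac{[\mathrm{O_2}]}{[\mathrm{O_2}] + K_d} = \frac{p\mathrm{O_2}}{p\mathrm{O_2} + P_{50}}$$

where the last form expresses the oxygen concentration as a partial pressure. Setting $p\mathrm{O_2} = P_{50}$ gives $Y = 0.5$, so for myoglobin $Y = 0.5$ at 2.8 torr, and at ten times that pressure $Y = 28/(28 + 2.8) \approx 0.91$.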

As Figure 4.52 shows, at the high pO₂ found in the lungs (about 100 torr) both 
myoglobin and hemoglobin are nearly saturated. However, at pO₂ values below about 50 torr, 
myoglobin is still almost fully saturated whereas hemoglobin is only partially saturated. 
Much of the oxygen carried by hemoglobin in erythrocytes is released within the capillaries 
of tissues where pO₂ is low (20 to 40 torr). Myoglobin in muscle tissue then binds 
oxygen released from hemoglobin. The differential affinities of myoglobin and hemoglobin 
for oxygen thus lead to an efficient system for oxygen delivery from the lungs to muscle. 
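
The delivery argument can be made concrete with a rough calculation. The sketch below uses the P₅₀ values given in the text; the Hill equation and the Hill coefficient of 2.8 are assumptions brought in to approximate the sigmoidal hemoglobin curve and are not taken from this section. The tissue pressure of 30 torr is simply the midpoint of the 20 to 40 torr range.

```python
def fractional_saturation(p_o2: float, p50: float, hill_n: float = 1.0) -> float:
    """Fractional saturation Y at oxygen partial pressure p_o2 (torr).

    hill_n = 1 gives a hyperbolic curve; hill_n > 1 gives a sigmoidal curve
    (empirical Hill equation, an assumption not stated in the text).
    """
    return p_o2**hill_n / (p50**hill_n + p_o2**hill_n)


P50_MB = 2.8      # torr, from the text
P50_HB = 26.0     # torr, from the text
HILL_N_HB = 2.8   # assumed Hill coefficient for hemoglobin

LUNGS = 100.0     # torr, from the text
TISSUE = 30.0     # torr, midpoint of the 20 to 40 torr range in the text

for name, p50, n in [("myoglobin", P50_MB, 1.0), ("hemoglobin", P50_HB, HILL_N_HB)]:
    y_lungs = fractional_saturation(LUNGS, p50, n)
    y_tissue = fractional_saturation(TISSUE, p50, n)
    print(f"{name}: Y(lungs)={y_lungs:.2f}  Y(tissue)={y_tissue:.2f}  "
          f"released={y_lungs - y_tissue:.2f}")
```

Under these assumptions hemoglobin gives up a substantial fraction of its bound oxygen between lungs and tissue, whereas a carrier with the affinity of myoglobin would release only a few percent.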

The cooperative binding of oxygen by hemoglobin can be related to changes in the 
protein conformation that occur on oxygenation. Deoxyhemoglobin is stabilized by 
several intra- and intersubunit ion pairs. When oxygen binds to one of the subunits, 
it causes a movement that disrupts these ion pairs and favors a slightly different conforma- 
tion. The movement is triggered by the reactivity of the heme iron atom (Figure 4.53). 
In deoxyhemoglobin, the iron atom is bound to only five ligands (as in myoglobin). It is 
slightly larger than the cavity within the porphyrin ring and lies below the plane of the ring. 
When O₂, the sixth ligand, binds to the iron atom, the electronic structure of the iron 


▲ Figure 4.52 

Oxygen-binding curves of myoglobin and hemoglobin. (a) Comparison of myoglobin and hemoglobin. The fractional saturation (Y) of each protein is 
plotted against the partial pressure of oxygen (pO₂). The oxygen-binding curve of myoglobin is hyperbolic, with half-saturation (Y = 0.5) at an oxygen 
pressure of 2.8 torr. The oxygen-binding curve of hemoglobin in whole blood is sigmoidal, with half-saturation at an oxygen pressure of 26 torr. 
Myoglobin has a greater affinity than hemoglobin for oxygen at all oxygen pressures. In the lungs, where the partial pressure of oxygen is high, 
hemoglobin is nearly saturated with oxygen. In tissues, where the partial pressure of oxygen is low, oxygen is released from oxygenated hemoglobin and 
transferred to myoglobin. (b) O₂ binding by the different states of hemoglobin. The oxy (R, or high-affinity) state of hemoglobin has a hyperbolic 
binding curve. The deoxy (T, or low-affinity) state of hemoglobin would also have a hyperbolic binding curve but with a much higher concentration for 
half-saturation. Solutions of hemoglobin containing mixtures of low- and high-affinity forms show sigmoidal binding curves with intermediate oxygen 
affinities. 





▲ Figure 4.53 

Conformational changes in a hemoglobin chain induced by oxygenation. When the heme iron of a he- 
moglobin subunit is oxygenated (red), the proximal histidine residue is pulled toward the porphyrin 
ring. The helix containing the histidine also shifts position, disrupting ion pairs that cross-link the 
subunits of deoxyhemoglobin (blue). 



changes, its diameter decreases, and it moves into the plane of the porphyrin ring 
pulling the helix that contains the proximal histidine. The change in tertiary structure 
results in a slight change in quaternary structure and this allows the remaining subunits 
to bind oxygen more readily. The entire tetramer appears to shift from the deoxy to the 
oxy conformation only after at least one oxygen molecule binds to each αβ dimer. (For 
further discussion, see Section 5.9C.) 

The conformational change of hemoglobin is responsible for the positive cooperativity 
of binding seen in the binding curve (Figure 4.52a). The shape of the curve is due to 
the combined effect of the two conformations (Figure 4.52b). The completely deoxygenated 
form of hemoglobin has a low affinity for oxygen and thus exhibits a hyperbolic 
binding curve with a very high concentration for half-saturation. Only a small amount of 
hemoglobin is saturated at low oxygen concentrations. As the concentration of oxygen 
increases, some of the hemoglobin molecules bind a molecule of oxygen and this increases 
their affinity for oxygen so that they are more likely to bind additional oxygen. The result 
is a sharp rise in binding, which gives the curve its sigmoidal shape, as more molecules of 
hemoglobin adopt the oxy conformation. If all of the hemoglobin molecules were in the oxy 
conformation, a solution would exhibit a hyperbolic binding curve. Release of the oxygen 
molecules allows the hemoglobin molecule to re-form the ion pairs and resume the deoxy 
conformation. 

The two conformations of hemoglobin are called the T (tense) and R (relaxed) 
states, using the standard terminology for such conformational changes. In hemoglo- 
bin, the deoxy conformation, which resists oxygen binding, is considered the inactive 
(T) state, and the oxy conformation, which facilitates oxygen binding, is considered the 
active (R) state. The R and T states are in dynamic equilibrium. 
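
One way to see how two hyperbolic states in equilibrium can produce a sigmoidal curve is the two-state concerted (Monod-Wyman-Changeux) model. This section does not introduce that model, so the sketch below is an illustration under that assumption, and the three parameter values are hypothetical numbers chosen only to show the shape of the curves; they are not fitted to hemoglobin data.

```python
def mwc_saturation(p_o2, k_r, c, l0, n=4):
    """Fractional saturation in the two-state concerted (MWC) model.

    p_o2 : oxygen partial pressure (torr)
    k_r  : dissociation constant of the R state (torr)
    c    : K_R / K_T, so c < 1 means the T state binds more weakly
    l0   : [T]/[R] ratio in the absence of oxygen
    n    : number of binding sites (four hemes per tetramer)
    """
    a = p_o2 / k_r
    numerator = a * (1 + a) ** (n - 1) + l0 * c * a * (1 + c * a) ** (n - 1)
    denominator = (1 + a) ** n + l0 * (1 + c * a) ** n
    return numerator / denominator


K_R = 1.0   # torr, hypothetical R-state half-saturation pressure
C = 0.01    # hypothetical K_R / K_T (T state binds 100-fold more weakly)
L0 = 1.0e4  # hypothetical [T]/[R] ratio with no oxygen bound

for p in (1, 5, 10, 20, 50, 100):
    y_r = mwc_saturation(p, K_R, C, 0.0)   # every molecule locked in R: hyperbolic
    y_t = p / (p + K_R / C)                # every molecule locked in T: hyperbolic
    y_mix = mwc_saturation(p, K_R, C, L0)  # R and T in equilibrium: sigmoidal
    print(f"pO2={p:>3} torr  R only={y_r:.3f}  T only={y_t:.3f}  mixture={y_mix:.3f}")
```

With these illustrative values the mixture tracks the low-affinity curve at low oxygen pressure and approaches the high-affinity curve at high pressure, which is the behavior described for Figure 4.52b.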


The human α globin genes are located on chromosome 16 in 
a cluster of related members of the globin gene family. There 
are two different genes encoding α globin: α₁ and α₂. 
Upstream of these genes there is another functional gene called 
ζ (zeta). The locus includes two nonfunctional pseudogenes, 
one related to ζ and the other derived from a duplicated 
α globin gene. 

The β globin gene is on chromosome 11 and it is also 
located at a locus where there are other members of the globin 
gene family. The functional genes are δ, two related γ globin 
genes (γᴬ and γᴳ), and an ε (epsilon) gene. This locus also 
contains a pseudogene related to β (ψβ). 

The other globin genes encode hemoglobin subunits that 
are expressed in the early embryo and in the fetus. The embryonic 
hemoglobins are called Gower 1 (ζ₂ε₂), Gower 2 (α₂ε₂), 
and Portland (ζ₂γ₂). The fetal hemoglobin has the subunit 
composition α₂γ₂. The adult hemoglobins are α₂β₂ and α₂δ₂. 

During early embryogenesis, the growing embryo gets 
oxygen from the mother’s blood through the placenta. 
The concentration of oxygen in the embryo is much lower 
than the concentration of oxygen in adult blood. The embryonic 
hemoglobins compensate by binding oxygen much more 
tightly; their P₅₀ values range from 4 to 12 torr, much lower 
than the value of adult hemoglobin (26 torr). The fetal hemoglobins 
bind oxygen less tightly than the embryonic hemoglobins 
but more tightly than the adult hemoglobins (P₅₀ = 20 torr). 

Expression of the various globin genes is carefully regulated 
so that the right genes are transcribed at the right time. 
Sometimes mutations arise where the fetal γ globin genes are 
inappropriately expressed in adults. The result is a phenotype 
known as Hereditary Persistence of Fetal Hemoglobin 
(HPFH). This is just one of hundreds of hemoglobin variants 
that have been detected in humans. You can read about them 
on a database called Online Mendelian Inheritance in Man 
(OMIM), the most complete and accurate database of 
human genetic diseases. 

► Human fetus. 

◄ Globin genes on chromosome 16 and chromosome 11. 


▲ Julian Voss-Andreae created a sculpture called “Heart of Steel (Hemoglobin)” in 2005 in the City of Lake Oswego, Oregon. The sculpture is a 
depiction of a hemoglobin molecule with a bound oxygen atom. The original sculpture was shiny steel (left). After 10 days (middle) it had started 
to rust as the iron in the steel reacted with oxygen in the atmosphere. After several months (right) the sculpture was completely rust colored. 

C. Hemoglobin Is an Allosteric Protein 

The binding and release of oxygen by hemoglobin are regulated by allosteric interactions 
(from the Greek allos, “other”). In this respect, hemoglobin — a carrier protein, not an 
enzyme — resembles certain regulatory enzymes (Section 5.9). Allosteric interactions 
occur when a specific small molecule, called an allosteric modulator, or allosteric effector, 
binds to a protein (usually an enzyme) and modulates its activity. The allosteric modu- 
lator binds reversibly at a site separate from the functional binding site of the protein. 
An effector molecule may be an activator or an inhibitor. A protein whose activity is 
modulated by allosteric effectors is called an allosteric protein. 

Allosteric modulation is accomplished by small but significant changes in the con- 
formations of allosteric proteins. It involves cooperativity of binding that is regulated by 
binding of the allosteric effector to a distinct site that doesn’t overlap the normal bind- 
ing site of a substrate, product, or transported molecule such as oxygen. An allosteric 
protein is in an equilibrium in which its active shape (R state) and its inactive shape 
(T state) are rapidly interconverting. A substrate, which obviously binds at the active 
site (to heme in hemoglobin), binds most avidly when the protein is in the R state. An 
allosteric inhibitor, which binds at an allosteric or regulatory site, binds most avidly to 
the T state. The binding of an allosteric inhibitor to its own site causes the allosteric 
protein to change rapidly from the R state to the T state. The binding of a substrate to 
the active site (or an allosteric activator to the allosteric site) causes the reverse change. 
The change in conformation of an allosteric protein caused by binding or release of an 
effector extends from the allosteric site to the functional binding site (the active site). 
The activity level of an allosteric protein depends on the relative proportions of mole- 
cules in the R and T forms and these, in turn, depend on the relative concentrations of 
the substrates and modulators that bind to each form. 

The molecule 2,3-bisphospho-D-glycerate (2,3BPG) is an allosteric effector of 
mammalian hemoglobin. The presence of 2,3BPG in erythrocytes raises the P₅₀ for 
binding of oxygen to adult hemoglobin to about 26 torr, much higher than the P₅₀ for 
oxygen binding to purified hemoglobin in aqueous solution (about 12 torr). In other 
words, 2,3BPG in erythrocytes substantially lowers the affinity of deoxyhemoglobin for 
oxygen. The concentrations of 2,3BPG and hemoglobin within erythrocytes are nearly 
equal (about 4.7 mM). 
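
The two figures quoted for erythrocytes, roughly 3 × 10⁸ hemoglobin molecules per cell and a hemoglobin concentration of about 4.7 mM, can be checked against each other. The following is a back-of-the-envelope sketch; the variable names are illustrative, and the only value not taken from the text is Avogadro's number.

```python
AVOGADRO = 6.022e23              # molecules per mole

hb_molecules_per_cell = 3e8      # from the text
hb_concentration_molar = 4.7e-3  # 4.7 mM, from the text

# Volume a cell would need for 3e8 molecules to give a 4.7 mM solution
moles_hb_per_cell = hb_molecules_per_cell / AVOGADRO
implied_volume_liters = moles_hb_per_cell / hb_concentration_molar
implied_volume_femtoliters = implied_volume_liters * 1e15

# One O2 per heme and four hemes per hemoglobin tetramer
max_o2_per_cell = hb_molecules_per_cell * 4

print(f"Implied cell volume: {implied_volume_femtoliters:.0f} fL")
print(f"Maximum O2 molecules per cell: {max_o2_per_cell:.1e}")
```

The implied volume of about 100 fL is of the same order as commonly quoted human erythrocyte volumes, so the two numbers are mutually consistent.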






▲ 2,3-Bisphospho-D-glycerate (2,3BPG). 

The synthesis of 2,3BPG in red blood 
cells is described in Box 11.2 
(Chapter 11). 


Figure 4.54 ► 

Binding of 2,3BPG to deoxyhemoglobin. The 
central cavity of deoxyhemoglobin is lined 
with positively charged groups that are complementary 
to the carboxylate and phosphate 
groups of 2,3BPG. Both 2,3BPG and 
the ion pairs shown help stabilize the deoxy 
conformation. The α subunits are shown in 
pink, the β subunits in blue, and the heme 
prosthetic groups in red. 

R and T conformations are explained 
more thoroughly in Section 5.10, 
“Theory of Allostery.” 

▲ Figure 4.55 

Bohr effect. Lowering the pH decreases the 
affinity of hemoglobin for oxygen. 

The effector 2,3BPG binds in the central cavity of hemoglobin between the two 
β subunits. In this binding pocket there are six positively charged side chains and the 
N-terminal α-amino group of each β chain forming a cationic binding site (Figure 4.54). 
In deoxyhemoglobin, these positively charged groups can interact electrostatically with 
the five negative charges of 2,3BPG. When 2,3BPG is bound, the deoxy conformation 
(the T state, which has a low affinity for O₂) is stabilized and conversion to the oxy 
conformation (the R or high-affinity state) is inhibited. In oxyhemoglobin, the β chains are 
closer together and the allosteric binding site is too small to accommodate 2,3BPG. The 
reversibly bound ligands O₂ and 2,3BPG have opposite effects on the R ⇌ T 
equilibrium. Oxygen binding increases the proportion of hemoglobin molecules in the oxy (R) 
conformation and 2,3BPG binding increases the proportion of hemoglobin molecules 
in the deoxy (T) conformation. Because oxygen and 2,3BPG have different binding 
sites, 2,3BPG is a true allosteric effector. 

In the absence of 2,3BPG, hemoglobin is nearly saturated at an oxygen pressure of 
about 20 torr. Thus, at the low partial pressure of oxygen that prevails in the tissues (20 to 
40 torr), hemoglobin without 2,3BPG would not unload its oxygen. In the presence of 
equimolar 2,3BPG, however, hemoglobin is only about one-third saturated at 20 torr. The 
allosteric effect of 2,3BPG causes hemoglobin to release oxygen at the low partial pressures 
of oxygen in the tissues. In muscle, myoglobin can bind some of the oxygen that is released. 
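
The effect of 2,3BPG at 20 torr can be estimated from the two half-saturation pressures quoted earlier (about 12 torr without 2,3BPG and about 26 torr with it). The estimate below is a sketch that assumes the empirical Hill equation with a Hill coefficient $n = 2.8$ in both cases; neither the equation nor the value of $n$ comes from this section, and the coefficient for purified hemoglobin may differ.

$$Y = \frac{1}{1 + \left(P_{50}/p\mathrm{O_2}\right)^{n}}$$

$$\text{with 2,3BPG: } Y = \frac{1}{1 + (26/20)^{2.8}} \approx 0.32 \qquad\qquad \text{without 2,3BPG: } Y = \frac{1}{1 + (12/20)^{2.8}} \approx 0.81$$

The first value matches the statement that hemoglobin is about one-third saturated at 20 torr in the presence of 2,3BPG. The second shows that, under the same assumption, most of the oxygen would remain bound if the effector were absent.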

Additional regulation of the binding of oxygen to hemoglobin involves carbon 
dioxide and protons, both of which are products of aerobic metabolism. CO₂ decreases 
the affinity of hemoglobin for O₂ by lowering the pH inside red blood cells. 
Enzyme-catalyzed hydration of CO₂ in erythrocytes produces carbonic acid, H₂CO₃, which 
dissociates to form bicarbonate and a proton, thereby lowering the pH. 

$$\mathrm{CO_2 + H_2O \rightleftharpoons H_2CO_3 \rightleftharpoons H^{+} + HCO_3^{-}} \qquad (4.4)$$

The lower pH leads to protonation of several groups in hemoglobin. These groups then 
form ion pairs that help stabilize the deoxy conformation. The increase in the concentration 
of CO₂ and the concomitant decrease in pH raise the P₅₀ of hemoglobin (Figure 4.55). 
This phenomenon, called the Bohr effect, increases the efficiency of the oxygen delivery 
system. In the lungs, where the CO₂ level is low, O₂ is readily picked up by 
hemoglobin; in metabolizing tissues, where the CO₂ level is relatively high and the pH is 
relatively low, O₂ is readily unloaded from oxyhemoglobin. 


Carbon dioxide is transported from the tissues to the lungs in two ways. Most CO₂ 
produced by metabolism is transported as dissolved bicarbonate ions, but some carbon 
dioxide is carried by hemoglobin itself in the form of carbamate adducts (Figure 4.56). At 
the pH of red blood cells (7.2) and at high concentrations of CO₂, the unprotonated 
amino groups of the four N-terminal residues of deoxyhemoglobin (pKa values between 7 
and 8) can react reversibly with CO₂ to form carbamate adducts. The carbamates of 
oxyhemoglobin are less stable than those of deoxyhemoglobin. When hemoglobin reaches 
the lungs, where the partial pressure of CO₂ is low and the partial pressure of O₂ is high, 
hemoglobin is converted to its oxygenated state and the CO₂ that was bound is released. 
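
Because only the unprotonated amino group can react with CO₂, it is useful to estimate how much of it is available at the pH of the red blood cell. The sketch below applies the standard Henderson-Hasselbalch relationship, which is assumed here rather than stated in this passage, across the pKa range given in the text; the function name is illustrative.

```python
def fraction_unprotonated(ph: float, pka: float) -> float:
    """Fraction of an amino group present in the unprotonated form at a given pH."""
    return 1.0 / (1.0 + 10 ** (pka - ph))


RBC_PH = 7.2  # pH of red blood cells, from the text

for pka in (7.0, 7.5, 8.0):  # spans the pKa range given in the text
    fraction = fraction_unprotonated(RBC_PH, pka)
    print(f"pKa {pka}: {fraction:.2f} of the amino groups are unprotonated")
```

Across this range a sizable minority to a majority of the N-terminal amino groups are unprotonated at pH 7.2, which is why they can serve as sites for carbamate formation.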

4.15 Antibodies Bind Specific Antigens 

Vertebrates possess a complex immune system that eliminates foreign substances includ- 
ing infectious bacteria and viruses. As part of this defense system, vertebrates synthesize 
proteins called antibodies (also known as immunoglobulins) that specifically recognize 
and bind antigens. Many different types of foreign compounds can serve as antigens that 
produce an immune response. Antibodies are synthesized by white blood cells called 
lymphocytes — each lymphocyte and its descendants synthesize the same antibody. Be- 
cause animals are exposed to many foreign substances over their lifetimes, they develop a 
huge array of antibody-producing lymphocytes that persist at low levels for many years 
and can later respond to the antigen during reinfection. The memory of the immune sys- 
tem is the reason certain infections do not recur in an individual despite repeated expo- 
sure. Vaccines (inactivated pathogens or analogs of toxins) administered to children are 
effective because immunity established in childhood lasts through adulthood. 

When an antigen — either novel or previously encountered — binds to the surface of 
lymphocytes, these cells are stimulated to proliferate and produce soluble antibodies for 
secretion into the bloodstream. The soluble antibodies bind to the foreign organism or 
substance forming antibody-antigen complexes that precipitate and mark the antigen 
for destruction by a series of interacting proteases or by lymphocytes that engulf the 
antigen and digest it intracellularly. 

The most abundant antibodies in the bloodstream are of the immunoglobulin 
G class (IgG). These are Y-shaped oligomers composed of two identical light chains and 
two identical heavy chains connected by disulfide bonds (Figure 4.57). Immunoglobulins 
are glycoproteins containing covalently bound carbohydrates attached to the heavy 
chains. The N-termini of pairs of light and heavy chains are close together. Light chains contain 
two domains and heavy chains contain four domains. Each of the domains consists of 








▲ Figure 4.56 

Carbamate adduct. Carbon dioxide produced 
by metabolizing tissues can react reversibly 
with the N-terminal residues of the globin 
chains of hemoglobin, converting them to 
carbamate adducts. 







◄ Figure 4.57 

Human antibody structure. (a) Structure. 
(b) Diagram. Two heavy chains (blue) and 
two light chains (red) of antibodies of the 
immunoglobulin G class are joined by disulfide 
bonds (yellow). The variable domains of 
both the light and heavy chains (where 
antigen binds) are colored more darkly. 

130 CHAPTER 4 Proteins: Three-Dimensional Structure and Function 

▲ Figure 4.58 

The immunoglobulin fold. The domain consists 
of a sandwich of two antiparallel 
β sheets. [PDB 1REI] 

about 110 residues assembled into a common motif called the immunoglobulin fold whose 
characteristic feature is a sandwich composed of two antiparallel β sheets (Figure 4.58). 
This domain structure is found in many other proteins of the immune system. 

The N-terminal domains of antibodies are called the variable domains because 
of their sequence diversity. They determine the specificity of antigen binding. X-ray 
crystallographic studies have shown that the antigen-binding site of a variable do- 
main consists of three loops, called hypervariable regions, that differ widely in size 
and sequence. The loops from a light chain and a heavy chain combine to form a 
barrel, the upper surface of which is complementary to the shape and polarity of a 
specific antigen. The match between the antigen and antibody is so close that there 
is no space for water molecules between the two. The forces that stabilize the inter- 
action of antigen with antibody are primarily hydrogen bonds and electrostatic in- 
teractions. An example of the interaction of antibodies with a protein antigen is 
shown in Figure 4.59. 

Antibodies are used in the laboratory for the detection of small quantities of vari- 
ous substances because of their remarkable antigen-binding specificity. In a common 
type of immunoassay, fluid containing an unknown amount of antigen is mixed with a 
solution of labeled antibody and the amount of antibody-antigen complex formed is 
measured. The sensitivity of these assays can be enhanced in a variety of ways to make 
them suitable for diagnostic tests. 

▲ Figure 4.59 

Binding of three different antibodies to an antigen (the protein lysozyme). The structures of the three 
antigen-antibody complexes have been determined by X-ray crystallography. This composite view, 
in which the antigen and antibodies have been separated, shows the surfaces of the antigen and 
antibodies that interact. Only parts of the three antibodies are shown. 

Summary 


1. Proteins fold into many different shapes, or conformations. Many 
proteins are water-soluble, roughly spherical, and tightly folded. 
Others form long filaments that provide mechanical support to 
cells and tissues. Membrane proteins are integral components of 
membranes or are associated with membranes. 

2. There are four levels of protein structure: primary (sequence of 
amino acid residues), secondary (regular local conformation, 
stabilized by hydrogen bonds), tertiary (compacted shape of the 
entire polypeptide chain), and quaternary (assembly of two or 
more polypeptide chains into a multisubunit protein). 

3. The three-dimensional structures of biopolymers, such as 
proteins, can be determined by X-ray crystallography and NMR 
spectroscopy. 

4. The peptide group is polar and planar. Rotation around the 
N—Cα and Cα—C bonds is described by φ and ψ. 

5. The α helix, a common secondary structure, is a coil containing 
approximately 3.6 amino acid residues per turn. Hydrogen bonds 
between amide hydrogens and carbonyl oxygens are roughly par- 
allel to the helix axis. 


6. The other common type of secondary structure, β structure, 
often consists of either parallel or antiparallel β strands that are 
hydrogen-bonded to each other to form β sheets. 

7. Most proteins include stretches of nonrepeating conformation, 
including turns and loops that connect α helices and β strands. 

8. Recognizable combinations of secondary structural elements are 
called motifs. 

9. The tertiary structure of proteins consists of one or more do- 
mains, which may have recognizable structures and may be asso- 
ciated with particular functions. 

10. In proteins that possess quaternary structure, subunits are usually 
held together by noncovalent interactions. 

11. The native conformation of a protein can be disrupted by the ad- 
dition of denaturing agents. Renaturation may be possible under 
certain conditions. 

12. Folding of a protein into its biologically active state is a sequen- 
tial, cooperative process driven primarily by the hydrophobic ef- 
fect. Folding can be assisted by chaperones. 

13. Collagen is the major fibrous protein of connective tissues. The 
three left-handed helical chains of collagen form a right-handed 
supercoil. 

14. The compact, folded structures of proteins allow them to selectively 
bind other molecules. The heme-containing proteins myoglobin 
and hemoglobin bind and release oxygen. Oxygen binding to hemoglobin 
is characterized by positive cooperativity and allosteric 
regulation. 

15. Antibodies are multidomain proteins that bind foreign substances, 
or antigens, marking them for destruction. The variable domains 
at the ends of the heavy and light chains interact with the antigen. 

Problems 


1. Examine the following tripeptide: 


[Tripeptide structure with side chains labeled R₁, R₂, and R₃] 


(a) Label the a-carbon atoms and draw boxes around the atoms 
of each peptide group. 

(b) What do the R groups represent? 

(c) Why is there limited free rotation around the carbonyl C = O 
to N amide bonds? 

(d) Assuming that the chemical structure represents the correct 
conformation of the peptide linkage, are the peptide groups 
in the cis or the trans conformation? 

(e) Which bonds allow rotation of peptide groups with respect 
to each other? 

2. (a) Characterize the hydrogen-bonding pattern of (1) an α helix 
and (2) a collagen triple helix. 

(b) Explain how the amino acid side chains are arranged in each 
of these helices. 

3. Explain why (1) glycine and (2) proline residues are not commonly 
found in α helices. 

4. A synthetic 20 amino acid polypeptide named Betanova was designed 
as a small soluble molecule that would theoretically form 
stable β-sheet structures in the absence of disulfide bonds. NMR 
of Betanova in solution indicates that it does, in fact, form a 
three-stranded antiparallel β sheet. Given the sequence of Betanova 
below: 

(a) Draw a ribbon diagram for Betanova indicating likely 
residues for each hairpin turn between the β strands. 

(b) Show the interactions that are expected to stabilize this 
β-sheet structure. 


5. Each member of an important family of 250 different DNA-binding 
proteins is composed of a dimer with a common protein motif. 
This motif permits each DNA-binding protein to recognize and 
bind to specific DNA sequences. What is the common protein 
motif in the structure below? 

6. Refer to Figure 4.21 to answer the following questions. 

(a) To which of the four major domain categories does the middle 
domain of pyruvate kinase (PK) belong (all α, all β, α/β, α + β)? 

(b) Describe any characteristic domain “fold” that is prominent 
in this middle domain of PK. 

(c) Identify two other proteins that have the same fold as the 
middle domain of pyruvate kinase. 

7. Protein disulfide isomerase (PDI) markedly increases the rate of 
correct refolding of the inactive ribonuclease form with random 
disulfide bonds (Figure 4.35). Show the mechanism for the PDI- 
catalyzed rearrangement of a nonnative (inactive) protein with 
incorrect disulfide bonds to the native (active) protein with cor- 
rect disulfide bonds. 






8. Myoglobin contains eight α helices, one of which has the follow- 
ing sequence: 



Which side chains are likely to be on the side of the helix that faces 
the interior of the protein? Which are likely to be facing the aqueous 
solvent? Account for the spacing of the residues facing the interior. 

9. Homocysteine is an α-amino acid containing one more methylene 
group in its side chain than cysteine (side chain = —CH₂CH₂SH). 
Homocysteinuria is a genetic disease characterized by elevated 
levels of homocysteine in plasma and urine, as well as skeletal de- 
formities due to defects in collagen structure. Homocysteine re- 
acts readily with allysine under physiological conditions. Show 
this reaction and suggest how it might lead to defective cross- 
linking in collagen. 

10. The larval form of the parasite Schistosoma mansoni infects hu- 
mans by penetrating the skin. The larva secretes enzymes that 
catalyze the cleavage of peptide bonds between residues X and Y 
in the sequence -Gly-Pro-X-Y- (X and Y can be any of several 
amino acids). Why is this enzyme activity important for the parasite? 

11. (a) How does the reaction of carbon dioxide with water help explain 
the Bohr effect? Include the equation for the formation 
of bicarbonate ion from CO₂ and water, and explain the effects 
of H⁺ and CO₂ on hemoglobin oxygenation. 

(b) Explain the physiological basis for the intravenous adminis- 
tration of bicarbonate to shock victims. 

12. Fetal hemoglobin (Hb F) contains serine in place of the cationic 
histidine at position 143 of the β chains of adult hemoglobin (Hb A). 
Residue 143 faces the central cavity between the β chains. 

(a) Why does 2,3BPG bind more tightly to deoxy Hb A than to 
deoxy Hb F? 

(b) How does the decreased affinity of Hb F for 2,3BPG affect the 
affinity of Hb F for O₂? 

(c) The P₅₀ for Hb F is 18 torr, and the P₅₀ for Hb A is 26 torr. 
How do these values explain the efficient transfer of oxygen 
from maternal blood to the fetus? 

13. Amino acid substitutions at the αβ subunit interfaces of hemoglobin 
may interfere with the R ⇌ T quaternary structural 
changes that take place on oxygen binding. In the hemoglobin 
variant Hb Yakima, the R form is stabilized relative to the T form, 
and P₅₀ = 12 torr. Explain why the mutant hemoglobin is less efficient 
than normal hemoglobin (P₅₀ = 26 torr) in delivering oxygen 
to working muscle, where the partial pressure of O₂ may be as low as 10 to 20 torr. 

14. The spider venom from the Chilean Rose Tarantula (Grammostola 
spatulata) contains a toxin that is a 34-amino acid protein. It is 
thought to be a globular protein that partitions into the lipid 
membrane to exert its effect. The sequence of the protein is: 


(a) Identify the hydrophobic and highly hydrophilic amino acids 
in the protein. 

(b) The protein is thought to have a hydrophobic face that interacts 
with the lipid membrane. How can the hydrophobic amino 
acids far apart in sequence interact to form a hydrophobic face? 

[Adapted from Lee, S. and MacKinnon, R. (2004). Nature 430: 

15. Selenoprotein P is an unusual extracellular protein that contains 
8-10 selenocysteine residues and has a high content of cysteine 
and histidine residues. Selenoprotein P is found both as a plasma 
protein and as a protein strongly associated with the surface 
of cells. The association of selenoprotein P with cells is pro- 
posed to occur through the interaction of selenoprotein P 
with high-molecular-weight carbohydrate compounds classi- 
fied as glycosaminoglycans. One such compound is heparin 
(see structure on next page). Binding studies of selenoprotein P 
to heparin were carried out under different pH conditions. The 
results are shown in the graph on next page. 

(a) How is the binding of selenoprotein P to heparin dependent 
upon pH? 

(b) Give possible structural reasons for the binding dependence. 

(Hint: Use the information about which amino acids are 
abundant in selenoprotein P in your answer.) 

[Adapted from Arteel, G. E., Franken, S., Kappler, J., and Sies, H. 
(2000). Biol Chem. 381:265-268.] 


16. Gelatin is processed collagen that comes from the joints of ani- 
mals. When gelatin is mixed with hot water, the triple helix struc- 
ture unwinds and the chains separate, becoming random coils 
that dissolve in the water. As the dissolved gelatin mixture cools, 
the collagen forms a matrix that traps water; as a result, the mix- 
ture turns into the jiggling semisolid mass that is recognizable as 
Jell-O™. The directions on a box of gelatin include the following: 
“Chill until slightly thickened, then add 1 to 2 cups cooked or raw 
fruits or vegetables. Fresh or frozen pineapple must be cooked be- 
fore adding.” If the pineapple is not cooked, the gelatin will not 
set properly. Pineapple belongs to a group of plants called 
Bromeliads and contains a protease called bromelain. Explain 
why pineapple must be cooked before adding to gelatin. 

17. Hb Helsinki (HbH) is a hemoglobin mutant in which the lysine 
residue at position 82 has been replaced with methionine. The 
mutation is in the beta chain, and residue 82 is found in the central 
cavity of hemoglobin. The oxygen binding curves for normal adult 
hemoglobin (HbA, •) and HbH (■) at pH 7.4 in the presence of a 
physiological concentration of 2,3BPG are shown in the graph. 

[Adapted from Ikkala, E., Koskela, J., Pikkarainen, P., Rahiala, E.L., 
El-Hazmi, M. A., Nagai, K., Lang, A., and Lehmann, H. Acta Haematol. 
(1976). 56:257-275.] 

Explain why the curve for HbH is shifted from the curve for HbA. 
Does this mutation stabilize the R or T state? What result does this 
mutation have on oxygen affinity? 
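
The P₅₀ values quoted in Problems 12 and 13 can be explored numerically with the Hill equation. The sketch below is only a starting point, not a worked answer. It assumes a Hill coefficient of 2.8 for every hemoglobin (a simplification, since mutations that shift the R/T balance usually alter cooperativity as well) and a lung oxygen pressure of 100 torr. Neither number comes from the problems, and the function names are hypothetical.

```python
def fractional_saturation(p_o2: float, p50: float, n: float = 2.8) -> float:
    """Hill equation: fraction of oxygen-binding sites occupied at pressure p_o2 (torr)."""
    return p_o2 ** n / (p50 ** n + p_o2 ** n)


def oxygen_delivered(p50: float, p_lungs: float = 100.0, p_tissue: float = 20.0) -> float:
    """Saturation in the lungs minus saturation in the tissue (assumed pressures)."""
    return fractional_saturation(p_lungs, p50) - fractional_saturation(p_tissue, p50)


# P50 values taken from Problems 12 and 13.
for name, p50 in (("Hb A", 26.0), ("Hb F", 18.0), ("Hb Yakima", 12.0)):
    print(f"{name}: P50 = {p50} torr, fraction unloaded = {oxygen_delivered(p50):.2f}")
```

By construction the saturation is exactly one half when the oxygen pressure equals P₅₀, so a lower P₅₀ means tighter binding and less oxygen released at a given tissue pressure.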


Properties of Enzymes 

We have seen how the three-dimensional shapes of proteins allow them to 
serve structural and transport roles. We now discuss their functions as en- 
zymes. Enzymes are extraordinarily efficient, selective, biological catalysts. 
Every living cell has hundreds of different enzymes catalyzing the reactions essential for 
life — even the simplest living organisms contain hundreds of different enzymes. In 
multicellular organisms, the complement of enzymes differentiates one cell type from 
another but most of the enzymes we discuss in this book are among the several hundred 
common to all cells. These enzymes catalyze the reactions of the central metabolic path- 
ways necessary for the maintenance of life. 

In the absence of the enzymes, metabolic reactions will not proceed at significant 
rates under physiological conditions. The primary role of enzymes is to enhance the 
rates of these reactions to make life possible. Enzyme-catalyzed reactions are 10³ to 10²⁰ 
times faster than the corresponding uncatalyzed reactions. A catalyst is defined as a 
substance that speeds up the attainment of equilibrium. It may be temporarily changed 
during the reaction but it is unchanged in the overall process since it recycles to partici- 
pate in multiple reactions. Reactants bind to a catalyst and products dissociate from it. 
Note that a catalyst does not change the position of the reaction's equilibrium (i.e., it 
does not make an unfavorable reaction favorable). Instead, it lowers the amount of en- 
ergy needed in order for the reaction to proceed. Catalysts speed up both the forward 
and reverse reactions by converting a one- or two-step process into several smaller steps 
each needing less energy than the uncatalyzed reaction. 
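
To get a feel for what "less energy" means in numbers, the rate enhancements quoted above can be converted into an equivalent drop in the activation barrier. The sketch below assumes the transition-state relationship in which a rate constant is proportional to exp(−ΔG‡/RT), with the same proportionality constant for the catalyzed and uncatalyzed reactions. This relationship is not developed in this section, and the temperature of 25 °C and the function name are our own choices.

```python
import math

R = 8.314   # gas constant, J mol^-1 K^-1
T = 298.15  # assumed temperature in kelvin (25 degrees C)


def barrier_reduction_kj(rate_enhancement: float, temperature: float = T) -> float:
    """Drop in activation free energy (kJ/mol) implied by a given rate enhancement.

    Assumes k is proportional to exp(-dG/RT), so k_cat/k_uncat = exp(ddG/RT).
    """
    return R * temperature * math.log(rate_enhancement) / 1000.0


# The two ends of the range quoted in the text.
for enhancement in (1e3, 1e20):
    print(f"{enhancement:.0e}-fold faster: barrier lowered by about "
          f"{barrier_reduction_kj(enhancement):.0f} kJ/mol")
```

Under these assumptions the range of 10³ to 10²⁰ corresponds to lowering the barrier by roughly 17 to 114 kJ/mol, while the free energy difference between substrates and products is untouched.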

Enzymes are highly specific for the reactants, or substrates, they act on, but the de- 
gree of substrate specificity varies. Some enzymes act on a group of related substrates, 
and others on only a single compound. Many enzymes exhibit stereospecificity meaning 

I was awed by enzymes and fell 
instantly in love with them. I have 
since had love affairs with many 
enzymes (none as enduring as with 
DNA polymerase ), but I have never 
met a dull or disappointing one. 

—Arthur Kornberg (2001) 


Catalysts speed up the rate of 
forward and reverse reactions but 
they don’t change the equilibrium 

Top: The enzyme acetylcholinesterase with the reversible inhibitor donepezil hydrochloride (Aricept; shown in red) occupy- 
ing the active site. Aricept is used to improve mental functioning in patients with Alzheimer’s disease. It is thought to act 
by inhibiting the breakdown of the neurotransmitter acetylcholine in the brain, thus prolonging the neurotransmitter ef- 
fects. (It does not, however, affect the course of the disease.) [PDB 1EVE] 




▲ Enzyme reaction. This is a large-scale enzyme reaction where milk is being curdled to make 
Appenzeller cheese. The reaction is catalyzed by rennet (rennin), which was originally derived from 
cow stomach. Rennet contains the enzyme chymosin, a protease that cleaves the milk protein 
casein between phenylalanine and methionine residues. The reaction releases a hydrophobic 
fragment of casein that aggregates and precipitates forming curd. 

that they act on only a single stereoisomer of the substrate. Perhaps the most important 
aspect of enzyme specificity is reaction specificity — that is, the lack of formation of 
wasteful by-products. Reaction specificity is reflected in the exceptional purity of prod- 
uct (essentially 100%) — much higher than the purity of products of typical catalyzed 
reactions in organic chemistry. The specificity of enzymes not only saves energy for cells 
but also precludes the buildup of potentially toxic metabolic by-products. 

Enzymes can do more than simply increase the rate of a single, highly specific reac- 
tion. Some can also combine, or couple, two reactions that would normally occur sepa- 
rately. This property allows the energy gained from one reaction to be used in a second 
reaction. Coupled reactions are a common feature of many enzymes — the hydrolysis of 
ATP, for example, is often coupled to less favorable metabolic reactions. 

Some enzymatic reactions function as control points in metabolism. As we will see, 
metabolism is regulated in a variety of ways including alterations in the concentrations 
of enzymes, substrates, and enzyme inhibitors and modulation of the activity levels of 
certain enzymes. Enzymes whose activity is regulated generally have a more complex 
structure than unregulated enzymes. With few exceptions, regulated enzymes are 
oligomeric molecules that have separate binding sites for substrates and effectors, the 
compounds that act as regulatory signals. The fact that enzyme activity can be regulated 
is an important property that distinguishes biological catalysts from those encountered 
in a chemistry lab. 

The word enzyme is derived from a Greek word meaning “in yeast.” It indicates that 
these catalysts are present inside cells. In the late 1800s, scientists studied the fermentation 
of sugars by yeast cells. Vitalists (who maintained that organic compounds could be 
made only by living cells) said that intact cells were needed for fermentation. Mechanists 
claimed that enzymes in yeast cells catalyze the reactions of fermentation. The latter 
conclusion was supported by the observation that cell-free extracts of yeast can catalyze 
fermentation. This finding was soon followed by the identification of individual reactions 
and the enzymes that catalyze them. 

A generation later, in 1926, James B. Sumner crystallized the first enzyme (urease) 
and proved that it is a protein. Five more enzymes were purified in the next decade and 
also found to be proteins: pepsin, trypsin, chymotrypsin, carboxypeptidase, and Old 
Yellow Enzyme (a flavoprotein NADPH oxidase). Since then, almost all enzymes have 
been shown to be proteins or proteins plus cofactors. Certain RNA molecules also ex- 
hibit catalytic activity but they are not usually referred to as enzymes. 

Some of the first biochemistry depart- 
ments in universities were called 
Departments of Zymology. 

Catalytic RNA molecules are discussed 
in Chapters 21 and 22. 

136 CHAPTER 5 Properties of Enzymes 

▲ Crystals of a bacterial (Shewanella 
oneidensis) homologue of Old Yellow Enzyme. 

(Courtesy of J. Elegheert and S. N. 

We begin this chapter with a description of enzyme classification and nomencla- 
ture. Next, we discuss kinetic analysis (measurements of reaction rates) emphasizing 
how kinetic experiments can reveal the properties of an enzyme and the nature of the 
complexes it forms with substrates and inhibitors. Finally, we describe the principles of 
inhibition and activation of regulatory enzymes. Chapter 6 explains how enzymes work 
at the chemical level and uses serine proteases to illustrate the relationship between pro- 
tein structure and enzymatic function. Chapter 7 is devoted to the biochemistry of 
coenzymes, the organic molecules that assist some enzymes in their catalytic roles by 
providing reactive groups not found on amino acid side chains. In the remaining chapters 
we will present many other examples illustrating the four main properties of enzymes: 
(1) they function as catalysts, (2) they catalyze highly specific reactions, (3) they can 
couple reactions, and (4) their activity can be regulated. 

5.1 The Six Classes of Enzymes 

Most of the classical metabolic enzymes are named by adding the suffix -ase to the 
name of their substrates or to a descriptive term for the reactions they catalyze. For ex- 
ample, urease has urea as a substrate. Alcohol dehydrogenase catalyzes the removal of 
hydrogen from alcohols (i.e., the oxidation of alcohols). A few enzymes, such as trypsin 
and amylase, are known by their historic names. Many newly discovered enzymes are 
named after their genes or for some nondescriptive characteristic. For example, RecA is 
named after the recA gene and HSP70 is a heat shock protein — both enzymes catalyze 
the hydrolysis of ATP. 

A committee of the International Union of Biochemistry and Molecular Biology 
(IUBMB) maintains a classification scheme that categorizes enzymes according to the 
general class of organic chemical reaction that is catalyzed. The six categories — 
oxidoreductases, transferases, hydrolases, lyases, isomerases, and ligases — are defined 
below with an example of each type of enzyme. The IUBMB classification scheme as- 
signs a unique number, called the enzyme classification number, or EC number, to 
each enzyme. IUBMB also assigns a unique systematic name to each enzyme; it may be 
different from the common name of an enzyme. This book usually refers to enzymes 
by their common names. 

1. Oxidoreductases catalyze oxidation-reduction reactions. Most of these enzymes are 
commonly referred to as dehydrogenases. Other enzymes in this class are called oxi- 
dases, peroxidases, oxygenases, or reductases. There is a trend in biochemistry to 
refer to more and more of these enzymes by their systematic name, oxidoreduc- 
tases, rather than the more common names in the older biochemical literature. 
One example of an oxidoreductase is lactate dehydrogenase (EC 1.1.1.27), also 
called lactate:NAD oxidoreductase. This enzyme catalyzes the reversible conversion 
of L-lactate to pyruvate. The oxidation of L-lactate is coupled to the reduction of 
the coenzyme nicotinamide adenine dinucleotide (NAD⁺). 



                          Lactate dehydrogenase 
CH₃CH(OH)COO⁻ + NAD⁺   ⇌   CH₃COCOO⁻ + NADH + H⁺ 
   L-Lactate                Pyruvate 


2. Transferases catalyze group transfer reactions and many require the presence of 
coenzymes. In group transfer reactions a portion of the substrate molecule usually 
binds covalently to the enzyme or its coenzyme. This group includes kinases, 
enzymes that catalyze the transfer of a phosphoryl group from ATP. Alanine 
transaminase, whose systematic name is L-alanine:2-oxoglutarate aminotransferase 



The enzyme classification number for malate dehydrogenase 
is EC 1.1.1.37. This enzyme has an activity similar to that of 
lactate dehydrogenase described under oxidoreductases (see 
Figure 4.23, Box 13.3). 

The first number identifies this enzyme as a member of the 
first class of enzymes (oxidoreductases). The second number 
identifies the substrate group that malate dehydrogenase recognizes. 
Subclass 1.1 means that the substrate is a HC—OH 
group. The third number specifies the electron acceptor for 
this class of enzymes. Subclass 1.1.1 is for enzymes that use 
NAD⁺ or NADP⁺ as an acceptor. The final number means that 
malate dehydrogenase is the 37th enzyme in this category. 

Compare the EC number 
of malate dehydrogenase with 
that of lactate dehydrogenase to 
see how similar enzymes have 
similar classification numbers. 
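
The four-level hierarchy described in this box maps naturally onto a small data structure. The following sketch splits an EC number into its levels and counts how many leading levels two enzymes share. The helper names are hypothetical. The number used for malate dehydrogenase follows from the description above (subclass 1.1.1, 37th entry), the one for alanine transaminase is quoted in this section, and EC 1.1.1.27 for lactate dehydrogenase is the IUBMB assignment.

```python
# The six top-level classes named in Section 5.1.
EC_CLASSES = {
    1: "oxidoreductases",
    2: "transferases",
    3: "hydrolases",
    4: "lyases",
    5: "isomerases",
    6: "ligases",
}


def parse_ec(ec_number: str) -> dict:
    """Split an EC number such as '1.1.1.37' into its four levels."""
    fields = ec_number.split(".")
    if len(fields) != 4:
        raise ValueError(f"expected four fields in {ec_number!r}")
    main, subclass, sub_subclass, serial = (int(f) for f in fields)
    if main not in EC_CLASSES:
        raise ValueError(f"unknown enzyme class {main}")
    return {
        "class": EC_CLASSES[main],
        "subclass": f"{main}.{subclass}",
        "sub_subclass": f"{main}.{subclass}.{sub_subclass}",
        "serial": serial,
    }


def shared_levels(ec_a: str, ec_b: str) -> int:
    """Count how many leading levels two EC numbers have in common."""
    count = 0
    for a, b in zip(ec_a.split("."), ec_b.split(".")):
        if a != b:
            break
        count += 1
    return count


malate_dh = "1.1.1.37"
lactate_dh = "1.1.1.27"
alanine_transaminase = "2.6.1.2"

print(parse_ec(malate_dh))
print(shared_levels(malate_dh, lactate_dh))            # same class, subclass, sub-subclass
print(shared_levels(malate_dh, alanine_transaminase))  # different class
```

Comparing from the left is deliberate: a match in a later field means nothing if an earlier field differs, because each level is defined only within the one above it.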

Accurate enzyme identifi- 
cation and classification is an 
important and essential part of 
modern biological databases. 
The entire classification data- 
base can be seen at www.chem. 

(EC 2.6.1.2), is a typical transferase. It transfers an amino group from L-alanine to 
α-ketoglutarate (2-oxoglutarate). 


                              Alanine transaminase 
L-Alanine + α-Ketoglutarate   ⇌   Pyruvate + L-Glutamate 





3. Hydrolases catalyze hydrolysis. They are a special class of transferases with water 
serving as the acceptor of the group transferred. Pyrophosphatase is a simple example 
of a hydrolase. The systematic name of this enzyme is diphosphate phosphohydrolase 
(EC 3.6.1.1). 


                  Pyrophosphatase 
P₂O₇⁴⁻ + H₂O   →   2 HPO₄²⁻ 
Pyrophosphate      Phosphate 





4. Lyases catalyze lysis of a substrate generating a double bond in nonhydrolytic, 
nonoxidative, elimination reactions. In the reverse direction, lyases catalyze the ad- 
dition of one substrate to the double bond of a second substrate. Pyruvate decar- 
boxylase belongs to this class of enzymes since it splits pyruvate into acetaldehyde 
and carbon dioxide. The systematic name for pyruvate decarboxylase, 2-oxo-acid 
carboxy-lyase (EC 4.1.1.1), is rarely used. 

                 Pyruvate decarboxylase 
CH₃COCOO⁻ + H⁺   →   CH₃CHO + CO₂ 
 Pyruvate            Acetaldehyde 




5. Isomerases catalyze structural change within a single molecule (isomerization reac- 
tions). Because these reactions have only one substrate and one product, they are 
among the simplest enzymatic reactions. Alanine racemase (EC 5.1.1.1) is an 

▲ Distribution of all known enzymes by EC 
classification number. 1. oxidoreductases; 
2. transferases; 3. hydrolases; 4. lyases; 
5. isomerases; 6. ligases. 


isomerase that catalyzes the interconversion of L-alanine and D-alanine. The com- 
mon name is the same as the systematic name. 

            Alanine racemase 
L-Alanine   ⇌   D-Alanine 



6. Ligases catalyze ligation, or joining, of two substrates. These reactions require the 
input of chemical potential energy in the form of a nucleoside triphosphate 
such as ATP. Ligases are usually referred to as synthetases. Glutamine synthetase, or 
L-glutamate:ammonia ligase (ADP-forming) (EC 6.3.1.2), uses the energy of ATP 
hydrolysis to join glutamate and ammonia to produce glutamine. 

The human genome contains genes for 
about 1000 different enzymes catalyz- 
ing reactions in several hundred meta- 
bolic pathways ( Since 
many enzymes have multiple subunits 
there are about 3000 different genes 
devoted to making enzymes. We have 
about 20,000 genes so most of the 
genes in our genome do not encode 
enzymes or enzyme subunits. 

                           Glutamine synthetase 
L-Glutamate + ATP + NH₄⁺   →   L-Glutamine + ADP + Pᵢ     (5.6) 



From the examples given above we see that most enzymes have more than one sub- 
strate although the second substrate may be only a molecule of water or a proton. Al- 
though enzymes catalyze both forward and reverse reactions, one-way arrows are often 
used when the equilibrium favors a great excess of product over substrate. Remember 
that when a reaction reaches equilibrium the enzyme must be catalyzing both the for- 
ward and reverse reactions at the same rate. 

Recall that concentrations are indicat- 
ed by square brackets: [P] signifies the 
concentration of product, [E] the con- 
centration of enzyme, and [S] the con- 
centration of the substrate. 

5.2 Kinetic Experiments Reveal Enzyme Properties 

We begin our study of enzyme properties by examining the rates of enzyme-catalyzed 
reactions. Such studies fall under the category of enzyme kinetics (from the Greek 
kinetikos, “moving”). This is an appropriate place to begin since the most important 
property of enzymes is that they act as catalysts, speeding up the rates of reactions. 
Enzyme kinetics provides indirect information about the specificities and catalytic 
mechanisms of enzymes. Kinetic experiments also reveal whether an enzyme is regulated. 

Most enzyme research in the first half of the 20th century was limited to kinetic ex- 
periments. This research revealed how the rates of reactions are affected by variations in 
experimental conditions or changes in the concentration of enzyme or substrate. Before 
discussing enzyme kinetics in depth, let’s review the principles of kinetics for 
nonenzymatic chemical systems. These principles are then applied to enzymatic reactions. 

A. Chemical Kinetics 

Kinetic experiments examine the relationship between the amount of product (P) 
formed in a unit of time (Δ[P]/Δt) and the experimental conditions under which the 
reaction takes place. The basis of most kinetic measurements is the observation that the 
rate, or velocity (v), of a reaction varies directly with the concentration of each reactant 
(Section 1.4). This observation is expressed in a rate equation. For example, the rate 
equation for the nonenzymatic conversion of substrate (S) to product in an isomerization 
reaction is written as 



$$v = k[S] \quad (5.7)$$



The rate equation reflects the fact that the velocity of a reaction depends on the 
concentration of the substrate ([S]). The symbol k is the rate constant and indicates the 
speed or efficiency of a reaction. Each reaction has a different rate constant. The units 
of the rate constant for a simple first-order reaction such as this one are s⁻¹. 

As a reaction proceeds, the amount of product ([P]) increases and the amount of 
substrate ([S]) decreases. An example of the progress of several reactions is shown in 
Figure 5.1a. The velocity is the slope of the progress curve over a particular interval of time. 
The shape of the curves indicates that the velocity is decreasing over time as expected 
since the substrate is being depleted. 

In this hypothetical example, the velocity of the reaction might eventually become 
zero when the substrate is used up. This would explain why the curve flattens out at ex- 
tended time points. (See below for another explanation.) We are interested in the rela- 
tionship between substrate concentration and the velocity of a reaction since if we 
know these two values we can use Equation 5.7 to calculate the rate constant. The only 
accurate substrate concentration is the one we prepare at the beginning of the experi- 
ment because the concentration changes during the experiment. The velocity of the re- 
action at the very beginning is the value that we want to know. This value represents the 
rate of the reaction at a known substrate concentration before it changes. 

The initial velocity (v_0) can be determined from the slope of the progress curves 
(Figure 5.1a) or from the derivatives of the curves. A graph of initial velocity versus 
substrate concentration at the beginning of the experiment gives a straight line as shown in 
Figure 5.1b. The slope of the line in Figure 5.1b is the rate constant. 
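
The link between progress curves, initial velocities, and the rate constant can be checked numerically. The following Python sketch assumes an irreversible first-order reaction and uses a made-up rate constant (0.02 s⁻¹) and illustrative starting concentrations; none of these numbers come from the text. It estimates v_0 from the first instant of each progress curve and then recovers k as the slope of v_0 versus initial [S].

```python
import math

K_TRUE = 0.02  # hypothetical first-order rate constant in s^-1 (assumed for illustration)

def product_conc(s0, t, k=K_TRUE):
    """[P] at time t for an irreversible reaction S -> P: [P] = [S]0 * (1 - exp(-k*t))."""
    return s0 * (1.0 - math.exp(-k * t))

def initial_velocity(s0, dt=0.01):
    """Estimate v_0 as the slope of the progress curve over a short initial interval."""
    return (product_conc(s0, dt) - product_conc(s0, 0.0)) / dt

def slope_through_origin(xs, ys):
    """Least-squares slope of a line through the origin, matching v_0 = k[S]."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)

initial_concs = [0.05, 0.1, 0.2]  # initial substrate concentrations in M (illustrative)
v0_values = [initial_velocity(s0) for s0 in initial_concs]
k_estimate = slope_through_origin(initial_concs, v0_values)

for s0, v0 in zip(initial_concs, v0_values):
    print(f"[S]0 = {s0:.2f} M   v0 = {v0:.6f} M/s")
print(f"estimated k = {k_estimate:.4f} s^-1")
```

Because the slope is taken at the very start of each curve, the estimate stays close to the assumed constant; taking the slope at a later time would underestimate it, since [S] has already fallen.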

The experiment shown in Figure 5.1 will only determine the forward rate constant 
since the data were collected under conditions where there was no reverse reaction. This 
is another important reason for calculating initial velocity (v_0) rather than the rate at 
later time points. In a reversible reaction, the flattening of the progress curves does not 
represent zero velocity. Instead, it simply indicates that there is no net increase in 
product over time because the reaction has reached equilibrium. 

A better description of our simple reaction would be 

$$S \rightleftharpoons P \quad (5.8)$$


For a more complicated single-step reaction, such as the reaction S_1 + S_2 → P_1 + P_2, the 
rate is determined by the concentrations of both substrates. If both substrates are 
present at similar concentrations, the rate equation is 

$$v = k[S_1][S_2] \quad (5.9)$$

The rate constant for reactions involving two substrates has the units M⁻¹ s⁻¹. These 
rate constants can be easily determined by setting up conditions where the concentration 
of one substrate is very high and the other is varied. The rate of the reaction will 
depend on the concentration of the rate-limiting substrate. 
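
The simplification described here can be written out explicitly. If S_1 is held at a concentration so high that it is essentially unchanged during the measurement, Equation 5.9 reduces to a pseudo-first-order form. The symbol k′ and the subscript 0 for the starting concentration are introduced only for this illustration and are not used elsewhere in the text.

```latex
v = k[S_1][S_2] \approx k'[S_2],
\qquad k' = k[S_1]_0
\qquad \text{when } [S_1]_0 \gg [S_2]
```

A plot of v against [S_2] is then a straight line with slope k′ (units s⁻¹), and dividing k′ by the known [S_1]_0 gives k in M⁻¹ s⁻¹.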

B. Enzyme Kinetics 

One of the first great advances in biochemistry was the discovery that enzymes bind 
substrates transiently. In 1894, Emil Fischer proposed that an enzyme is a rigid tem- 
plate, or lock, and that the substrate is a matching key. Only specific substrates can fit 
into a given enzyme. Early studies of enzyme kinetics confirmed that an enzyme (E) 
binds a substrate to form an enzyme-substrate complex (ES). ES complexes are formed 
when ligands bind noncovalently in their proper places in the active site. The substrate 
interacts transiently with the protein catalyst (and with other substrates in a multisub- 
strate reaction) on its way to forming the product of the reaction. 

Let’s consider a simple enzymatic reaction; namely, the conversion of a single substrate 
to a product. Although most enzymatic reactions have two or more substrates, the 
general principles of enzyme kinetics can be described by assuming the simple case of 
one substrate and one product. 

$$E + S \longrightarrow ES \longrightarrow E + P \quad (5.10)$$

▲ Figure 5.1 
Rate of a simple chemical reaction. (a) The amount of product produced over time is 
plotted for several different initial substrate concentrations (0.05 M, 0.1 M, and 0.2 M). 
The initial velocity v_0 is the slope of the progress curve at the beginning of the 
reaction. (b) The initial velocity as a function of initial substrate concentration. 
The slope of the curve is the rate constant. 


The rate or velocity of a reaction depends 
on the concentration of substrate. 



The enzyme-substrate complex (ES) is a 
transient intermediate in an 
enzyme-catalyzed reaction. 


▲ Figure 5.2 
Effect of enzyme concentration ([E]) on the initial velocity (v_0) of an enzyme-catalyzed 
reaction at a fixed, saturating [S]. The reaction rate is affected by the concentration 
of enzyme but not by the concentration of the other reactant, S. 

▲ Figure 5.3 Progress curve for an enzyme-catalyzed reaction. [P], the concentration of 
product, increases with time (t) as the reaction proceeds. The initial velocity of the 
reaction, v_0, is the slope of the initial linear portion of the curve. Note that the rate 
of the reaction doubles when twice as much enzyme (2E, upper curve) is added to an 
otherwise identical reaction mixture. 

This reaction takes place in two distinct steps — the formation of the enzyme-substrate 
complex and the actual chemical reaction accompanied by the dissociation of the en- 
zyme and product. Each step has a characteristic rate. The overall rate of an enzymatic 
reaction depends on the concentrations of both the substrate and the catalyst (enzyme). 
When the amount of enzyme is much less than the amount of substrate the reaction 
will depend on the amount of enzyme. 

The straight line in Figure 5.2 illustrates the effect of enzyme concentration on the 
reaction velocity in a pseudo first-order reaction. The more enzyme present, the faster 
the reaction. These conditions are used in enzyme assays to determine the concentra- 
tions of enzymes. The concentration of enzyme in a test sample can be easily deter- 
mined by comparing its activity to a reference curve similar to the model curve in 
Figure 5.2. Under these experimental conditions, there are sufficient numbers of sub- 
strate molecules so that every enzyme molecule binds a molecule of substrate to form 
an ES complex, a condition called saturation of E with S. Enzyme assays measure the 
amount of product formed in a given time period. In some assay methods, a recording 
spectrophotometer can be used to record data continuously; in other methods, samples 
are removed and analyzed at intervals. The assay is performed at a constant pH and 
temperature, generally chosen for optimal enzyme activity or for approximation to 
physiological conditions. 
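
The use of a reference curve to estimate enzyme concentration can be sketched in a few lines of Python. The reference concentrations, activities, and units below are invented for illustration, and a straight line through the origin is assumed because the assay is run at saturating substrate, where activity is proportional to the amount of enzyme.

```python
def fit_line_through_origin(xs, ys):
    """Least-squares slope for activity = slope * [E] (no intercept term)."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)

# Hypothetical reference curve: known enzyme concentrations and measured activities
reference_enzyme_nM = [1.0, 2.0, 4.0, 8.0]
reference_activity = [0.52, 0.98, 2.05, 3.96]  # product formed, in µM per minute

slope = fit_line_through_origin(reference_enzyme_nM, reference_activity)

def enzyme_from_activity(activity):
    """Read an unknown enzyme concentration (nM) off the fitted reference line."""
    return activity / slope

unknown_activity = 1.5  # µM per minute, measured for a hypothetical test sample
estimate_nM = enzyme_from_activity(unknown_activity)
print(f"slope = {slope:.3f} µM/min per nM; estimated [E] = {estimate_nM:.2f} nM")
```

The estimate is only meaningful while the test sample falls within the linear range covered by the reference points and is assayed at the same pH, temperature, and substrate concentration.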

If we begin an enzyme-catalyzed reaction by mixing substrate and enzyme then 
there is no product present during the initial stages of the reaction. Under these condi- 
tions we can ignore the reverse reaction where P binds to E and is converted to S. The 
reaction can be described by 

$$E + S \underset{k_{-1}}{\overset{k_1}{\rightleftharpoons}} ES \xrightarrow{k_2} E + P \quad (5.11)$$


The rate constants k_1 and k_-1 in Reaction 5.11 govern the rates of association of S with E 
and dissociation of S from ES, respectively. This first step is an equilibrium binding 
interaction similar to the binding of oxygen to hemoglobin. The rate constant for the 
second step is k_2, which governs the rate of formation of product from ES. Note that 
conversion of the ES complex to free enzyme and product is shown by a one-way arrow because 
the rate of the reverse reaction (E + P → EP) is negligible at the start of a reaction. The 
velocity measured during this short period is the initial velocity (v_0) described in the 
previous section. The formation and dissociation of ES complexes are usually very rapid 
reactions because only noncovalent bonds are being formed and broken. In contrast, the 
conversion of substrate to product is usually rate limiting. It is during this step that the 
substrate is chemically altered. 

Enzyme kinetics differs from simple chemical kinetics because the rates of enzyme-catalyzed 
reactions depend on the concentration of enzyme and the enzyme is neither a 
product nor a substrate of the reaction. The rates also differ because substrate has to 
bind to enzyme before it can be converted to product. In an enzyme-catalyzed reaction, 
the initial velocities are obtained from progress curves, just as they are in chemical 
reactions. Figure 5.3 shows the progress curves at two different enzyme concentrations in 
the presence of a high initial concentration of substrate ([S] ≫ [E]). In this case, the 
rate of product formation depends on enzyme concentration and not on the substrate 
concentration. Data from experiments such as those shown in Figure 5.3 can be used to 
plot the curve shown in Figure 5.2. 

5.3 The Michaelis-Menten Equation 

Enzyme-catalyzed reactions, like any chemical reaction, can be described mathematically 
by rate equations. Several constants in the equations indicate the efficiency and 
specificity of an enzyme and are therefore useful for comparing the activities of several 
enzymes or for assessing the physiological importance of a given enzyme. The first rate 
equations were derived in the early 1900s by examining the effects of variations in 
substrate concentration. Figure 5.4a shows a typical result where the initial velocity (v_0) of 
a reaction is plotted against the substrate concentration ([S]). 


The data can be explained by the reaction shown in Reaction 5.11. The first step is a 
bimolecular interaction between the enzyme and substrate to form an ES complex. At 
high substrate concentrations (right-hand side of the curve in Figure 5.4) the initial 
velocity doesn’t change very much as more S is added. This indicates that the amount of 
enzyme has become rate-limiting in the reaction. The concentration of enzyme is an 
important component of the overall reaction as expected for formation of an ES 
complex. At low substrate concentrations (left-hand side of the curve in Figure 5.4), the 
initial velocity is very sensitive to changes in the substrate concentration. Under these 
conditions most enzyme molecules have not yet bound substrate and the formation of 
the ES complex depends on the substrate concentration. 

The shape of the v_0 vs. [S] curve is that of a rectangular hyperbola. Hyperbolic 
curves indicate processes involving simple dissociation as we saw for the dissociation of 
oxygen from oxymyoglobin (Section 4.13B). This is further evidence that the simple 
reaction under study is bimolecular, involving the association of E and S to form an ES 
complex. The equation for a rectangular hyperbola is 


$$y = \frac{ax}{b + x} \quad (5.12)$$


where a is the asymptote of the curve (the value of y at an infinite value of x) and b is 
the point on the x axis corresponding to a value of a/2. In enzyme kinetic experiments, 
y = v_0 and x = [S]. The asymptote value (a) is called V_max. It’s the maximum velocity of 
the reaction at infinitely large substrate concentrations. We often show the V_max value 
on v_0 vs. [S] plots but if you look at the figure it’s not obvious why this particular 
asymptote was chosen. One of the characteristics of hyperbolic curves is that the curve 
seems to flatten out at moderate substrate concentrations at a level that seems far less 
than the V_max value. The true V_max is not determined by trying to estimate the position 
of the asymptote from the shape of the curve; instead, it is precisely and correctly 
determined by fitting the data to the general equation for a rectangular hyperbola. 

The b term in the general equation for a rectangular hyperbola is called the 
Michaelis constant (K_m), defined as the concentration of substrate when v_0 is equal to 
one-half V_max (Figure 5.4b). The complete rate equation is 


$$v_0 = \frac{V_{\text{max}}[S]}{K_m + [S]} \quad (5.13)$$


This is called the Michaelis-Menten equation, named after Leonor Michaelis and Maud 
Menten. Note how the general form of the equation compares to Equation 5.12. The 
Michaelis-Menten equation describes the relationship between the initial velocity of a 
reaction and the substrate concentration. In the following section we derive the 
Michaelis-Menten equation by a kinetic approach and then consider the meaning of 
the various constants. 
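
A few evaluated points make the shape of the hyperbola concrete. The Python sketch below uses arbitrary, assumed values for V_max and K_m (they are not taken from the text) and prints the fraction of V_max reached at several multiples of K_m. It shows that the velocity is exactly half-maximal at [S] = K_m and is still visibly below V_max at ten times K_m, which is why the asymptote is hard to judge by eye.

```python
def michaelis_menten(s, v_max, k_m):
    """Initial velocity v_0 for substrate concentration s (Michaelis-Menten equation)."""
    return v_max * s / (k_m + s)

V_MAX = 100.0   # hypothetical maximum velocity, µM per second
K_M = 2.0e-3    # hypothetical Michaelis constant, M

for multiple in (0.1, 0.5, 1.0, 2.0, 10.0, 100.0):
    s = multiple * K_M
    v0 = michaelis_menten(s, V_MAX, K_M)
    print(f"[S] = {multiple:>5} x K_m   v0 = {v0:6.2f}   fraction of V_max = {v0 / V_MAX:.3f}")
```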

A. Derivation of the Michaelis-Menten Equation 

One common derivation of the Michaelis-Menten equation is termed the steady state 
derivation. It was proposed by George E. Briggs and J. B. S. Haldane. This derivation 
postulates a period of time (called the steady state) during which the ES complex is 
formed at the same rate that it decomposes so that the concentration of ES is constant. 
The initial velocity is used in the steady state derivation because we assume that the 
concentration of product ( [P] ) is negligible. The steady state is a common condition for 
metabolic reactions in cells. 

If we assume a constant steady state concentration of ES then the rate of formation 
of product depends on the rate of the chemical reaction and the rate of dissociation of P 
from the enzyme. The rate limiting step is the right-hand side of Reaction 5.11 and the 
velocity depends on the rate constant k_2 and the concentration of ES. 

$$ES \longrightarrow E + P \qquad v_0 = k_2[ES] \quad (5.14)$$


▲ Figure 5.4 
Plots of initial velocity (v_0) versus substrate concentration ([S]) for an enzyme-catalyzed 
reaction. (a) Each experimental point is obtained from a separate progress curve using the 
same concentration of enzyme. The shape of the curve is hyperbolic. At low substrate 
concentrations, the curve approximates a straight line that rises steeply. In this region 
of the curve, the reaction is highly dependent on the concentration of substrate. At high 
concentrations of substrate, the enzyme is almost saturated, and the initial rate of the 
reaction does not change much when substrate concentration is further increased. (b) The 
concentration of substrate that corresponds to half-maximum velocity is called the 
Michaelis constant (K_m). The enzyme is half-saturated when [S] = K_m. 


▲ Leonor Michaelis (1875-1949). 

The steady-state derivation solves Equation 5.14 for [ES] using terms that can be measured 
such as the rate constant, the total enzyme concentration ([E]_total), and the substrate 
concentration ([S]). [S] is assumed to be greater than [E]_total but not necessarily 
saturating. For example, soon after a small amount of enzyme is mixed with substrate, [ES] 
becomes constant because the overall rate of decomposition of ES (the sum of the rates 
of conversion of ES to E + S and to E + P) is equal to the rate of formation of the ES 
complex from E + S. The rate of formation of ES from E + S depends on the concentration 
of free enzyme (enzyme molecules not in the form of ES), which is [E]_total − [ES]. 
The concentration of the ES complex remains constant until consumption of S causes 
[S] to approach [E]_total. We can express these statements as a mathematical equation. 

Rate of ES formation = Rate of ES decomposition 

$$k_1([E]_{\text{total}} - [ES])[S] = (k_{-1} + k_2)[ES] \quad (5.15)$$

Equation 5.15 is rearranged to collect the rate constants. 

$$\frac{k_{-1} + k_2}{k_1} = K_m = \frac{([E]_{\text{total}} - [ES])[S]}{[ES]} \quad (5.16)$$



The ratio of rate constants on the left-hand side of Equation 5.16 is the Michaelis 
constant, K_m. Next, this equation is solved for [ES] in several steps. 

$$[ES]K_m = ([E]_{\text{total}} - [ES])[S] \quad (5.17)$$

$$[ES]K_m = ([E]_{\text{total}}[S]) - ([ES][S]) \quad (5.18)$$

Collecting [ES] terms, 


$$[ES](K_m + [S]) = [E]_{\text{total}}[S] \quad (5.19)$$

and 

$$[ES] = \frac{[E]_{\text{total}}[S]}{K_m + [S]} \quad (5.20)$$



▼ Maud Menten (1879-1960). 


An outstanding medical scientist, Maud Menten was born in Port Lambton. She graduated in 
medicine from the University of Toronto in 1907 and four years later became one of the first 
Canadian women to receive a medical doctorate. In 1913, in Germany, collaboration with 
Leonor Michaelis on the behaviour of enzymes resulted in the Michaelis-Menten equation, a 
basic biochemical concept which brought them international recognition. Menten continued 
her brilliant career as a pathologist at the University of Pittsburgh from 1918, publishing 
extensively on medical and biochemical subjects. Her many achievements included important 
co-discoveries relating to blood sugar, haemoglobin, and kidney functions. Between 1951 and 
1954 she conducted cancer research in British Columbia and returned to Ontario six years 
before she died. 



Equation 5.20 describes the steady-state ES concentration using terms that can be 
measured in an experiment. Substituting the value of [ES] into the velocity equation 
(Equation 5.14) gives 

$$v_0 = k_2[ES] = \frac{k_2[E]_{\text{total}}[S]}{K_m + [S]} \quad (5.21)$$


As indicated by Figure 5.4a, when the concentration of S is very high the enzyme is 
saturated and essentially all the molecules of E are present as ES. Adding more S has 
almost no effect on the reaction velocity. The only way to increase the velocity is to add 
more enzyme. Under these conditions the velocity is at its maximum rate (V_max) and 
this velocity is determined by the total enzyme concentration and the rate constant k_2. 
Thus, by definition, 

$$V_{\text{max}} = k_2[E]_{\text{total}} \quad (5.22)$$


Substituting this in Equation 5.21 gives the most familiar form of the 
Michaelis-Menten equation. 

$$v_0 = \frac{V_{\text{max}}[S]}{K_m + [S]} \quad (5.23)$$



The constant k_cat is the number of moles 
of substrate converted to product per 
second per mole of enzyme. 

We’ve already seen that this form of the Michaelis-Menten equation adequately 
describes the data from kinetic experiments. In this section we’ve shown that the same 
equation can be derived from a theoretical consideration of the implications of 
Reaction 5.11, the equation for an enzyme-catalyzed reaction. The agreement between 
theory and data gives us confidence that the theoretical basis of enzyme kinetics is sound. 
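
The steady-state assumption can also be tested numerically. The Python sketch below integrates the rate equations for Reaction 5.11 with a simple fixed-step Euler method and compares the simulated [ES] with the value predicted by Equation 5.20. All rate constants and concentrations are hypothetical, chosen only so that [S] is much greater than [E]_total; the Euler integrator is a convenience for illustration, not a method described in the text.

```python
# Hypothetical rate constants and concentrations (assumed for illustration only)
K1 = 1.0e6          # association of E and S, M^-1 s^-1
K_MINUS1 = 1.0e3    # dissociation of ES to E + S, s^-1
K2 = 1.0e2          # conversion of ES to E + P, s^-1
E_TOTAL = 1.0e-6    # total enzyme, M
S_INITIAL = 1.0e-3  # initial substrate, M (much greater than E_TOTAL)

K_M = (K_MINUS1 + K2) / K1  # Michaelis constant from Equation 5.16

def simulate(t_end=5.0e-3, dt=1.0e-6):
    """Euler integration of d[S]/dt, d[ES]/dt and d[P]/dt for E + S <-> ES -> E + P."""
    s, es, p = S_INITIAL, 0.0, 0.0
    for _ in range(int(round(t_end / dt))):
        e_free = E_TOTAL - es
        formation = K1 * e_free * s                # rate of ES formation
        decomposition = (K_MINUS1 + K2) * es       # rate of ES breakdown (both routes)
        d_es = formation - decomposition
        d_s = -formation + K_MINUS1 * es
        d_p = K2 * es
        es += d_es * dt
        s += d_s * dt
        p += d_p * dt
    return s, es, p

s_now, es_simulated, p_now = simulate()
es_predicted = E_TOTAL * s_now / (K_M + s_now)       # Equation 5.20
v0_simulated = K2 * es_simulated                     # Equation 5.14
v0_predicted = K2 * E_TOTAL * s_now / (K_M + s_now)  # Equation 5.21

print(f"[ES] simulated = {es_simulated:.4e} M, predicted = {es_predicted:.4e} M")
print(f"v0   simulated = {v0_simulated:.4e} M/s, predicted = {v0_predicted:.4e} M/s")
```

With these assumed values the ES complex builds up within about a millisecond and then stays essentially constant, so the simulated and predicted values agree closely while very little substrate has been consumed.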

B. The Catalytic Constant k_cat 

At high substrate concentration, the overall velocity of the reaction is V_max and the rate 
is determined by the enzyme concentration. The rate constant observed under these 
conditions is called the catalytic constant, k_cat, defined as 

$$V_{\text{max}} = k_{\text{cat}}[E]_{\text{total}} \qquad k_{\text{cat}} = \frac{V_{\text{max}}}{[E]_{\text{total}}} \quad (5.24)$$

where k_cat represents the number of moles of substrate converted to product per second 
per mole of enzyme (or per mole of active site for a multisubunit enzyme) under 
saturating conditions. In other words, k_cat indicates the maximum number of substrate 
molecules converted to product each second by each active site. This is often called 
the turnover number. The catalytic constant measures how quickly a given enzyme can 
catalyze a specific reaction; it’s a very useful way of describing the effectiveness of 
an enzyme. The unit for k_cat is s⁻¹ and the reciprocal of k_cat is the time required for 
one catalytic event. Note that the enzyme concentration must be known in order to 
calculate k_cat. 
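
As a worked example of Equation 5.24, the short Python sketch below computes k_cat and the time per catalytic event. The V_max and enzyme concentration are invented values used only to show the arithmetic and the units.

```python
def catalytic_constant(v_max, e_total):
    """k_cat = V_max / [E]total; with V_max in M/s and [E]total in M the result is in s^-1."""
    return v_max / e_total

V_MAX = 5.0e-5    # hypothetical maximum velocity, M per second
E_TOTAL = 1.0e-7  # hypothetical total enzyme (active site) concentration, M

k_cat = catalytic_constant(V_MAX, E_TOTAL)
seconds_per_event = 1.0 / k_cat  # reciprocal of k_cat: time for one catalytic event

print(f"k_cat = {k_cat:.0f} s^-1")
print(f"time per catalytic event = {seconds_per_event * 1000:.1f} ms")
```

If the active enzyme concentration is overestimated, for example because part of the preparation is inactive, the calculated k_cat will be too low by the same factor.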

For a simple reaction, such as Reaction 5.11, the rate-limiting step is the conversion 
of substrate to product and the dissociation of product from the enzyme (ES → E + P). 
Under these conditions k_cat is equal to k_2 (Equation 5.14). Many enzyme reactions are 
more complex. If one step is clearly rate-limiting then its rate constant is the k_cat for that 
reaction. If the mechanism is more complex then k_cat may be a combination of several 
different rate constants. This is why we need a different rate constant (k_cat) to describe 
the overall rate of the enzyme-catalyzed reaction. In most cases you can assume that k_cat 
is a good approximation of k_2. 

Representative values of k_cat are listed in Table 5.1. Most enzymes are potent catalysts 
with k_cat values of 10² to 10³ s⁻¹. This means that at high substrate concentrations a single 
enzyme molecule will convert 100-1000 molecules of substrate to product every second. 
This rate is limited by a number of factors that will be discussed in the next chapter 
(Chapter 6: Mechanisms of Enzymes). 

Table 5.1 Examples of catalytic constants 

Enzyme                    k_cat (s⁻¹)* 
[name missing]            10² 
[name missing]            10² 
[name missing]            10² (to 10³) 
[name missing]            10³ 
[name missing]            10³ 
[name missing]            10³ 
[name missing]            10³ 
Carbonic anhydrase        10⁶ 
Superoxide dismutase      10⁶ 
[name missing]            10⁷ 

*The catalytic constants are given only as orders of magnitude. 

▲ Substrate binding. Pyruvate carboxylase binds pyruvate, HCO₃⁻, and ATP. The 
structure of the active site of the yeast (Saccharomyces cerevisiae) enzyme is 
shown here with a bound molecule of pyruvate (space-filling representation) and 
the cofactor biotin (ball-and-stick). The K_m value for pyruvate binding is 4 × 10⁻⁴ M. 
The K_m values for HCO₃⁻ and ATP binding are 1 × 10⁻³ M and 6 × 10⁻⁵ M. 
[PDB 2VK1] 

Some enzymes are extremely rapid catalysts with k_cat values of 10⁶ s⁻¹ or greater. 
Mammalian carbonic anhydrase, for example, must act very rapidly in order to maintain 
equilibrium between aqueous CO₂ and bicarbonate (Section 2.10). As we will see in 
Section 6.4B, superoxide dismutase and catalase are responsible for rapid decomposition 
of the toxic oxygen metabolites superoxide anion and hydrogen peroxide, respectively. 
Enzymes that catalyze a million reactions per second often act on small substrate 
molecules that diffuse rapidly inside the cell. 

C. The Meanings of K_m 

The Michaelis constant has a number of meanings. Equation 5.16 defined K_m as the 
ratio of the combined rate constants for the breakdown of ES divided by the constant 
for its formation. If the rate constant for product formation (k_2) is much smaller than 
either k_1 or k_-1, as is often the case, k_2 can be neglected and K_m is equivalent to k_-1/k_1. 
In this case K_m is the same as the equilibrium constant for dissociation of the ES 
complex to E + S. Thus, K_m becomes a measure of the affinity of E for S. The lower the value 
of K_m, the more tightly the substrate is bound. K_m is also one of the parameters that 
determines the shape of the v_0 vs. [S] curve shown in Figure 5.4b. It is the substrate 
concentration when the initial velocity is one-half the V_max value. This meaning follows 
directly from the general equation for a rectangular hyperbola. 

K_m values are sometimes used to distinguish between different enzymes that catalyze 
the same reaction. For example, mammals have several different forms of lactate 
dehydrogenase, each with a distinct K_m value. Although it is useful to think of K_m 
as representing the equilibrium dissociation constant for ES, this is not always valid. 
For many enzymes K_m is a more complex function of the rate constants. This is 
especially true when the reaction occurs in more than two steps. 
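
For the two-step mechanism of Reaction 5.11, the relationship between K_m and the dissociation constant can be written out as follows. The symbol K_d for the equilibrium dissociation constant of ES is introduced here for illustration; it is an assumption of this sketch rather than notation taken from the text.

```latex
K_m = \frac{k_{-1} + k_2}{k_1} = K_d + \frac{k_2}{k_1},
\qquad
K_d = \frac{k_{-1}}{k_1} = \frac{[E][S]}{[ES]}
```

Written this way, K_m can never be smaller than K_d for this mechanism, and the two become nearly equal only when k_2 is much smaller than k_-1.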

Typical K_m values for enzymes range from 10⁻² to 10⁻⁵ M. Since these values often 
represent apparent dissociation constants their reciprocal is an apparent association 
(binding) constant. You can see by comparison with protein-protein interactions 
(Section 4.9) that the binding of enzymes to substrates is much weaker. 


K_m is the substrate concentration when 
the rate of the reaction is one-half the 
V_max value. It is often an approximation of 
the equilibrium dissociation constant of 
the reaction ES ⇌ E + S. 

5.4 Kinetic Constants Indicate Enzyme Activity 
and Catalytic Proficiency 

We’ve seen that the kinetic constants K_m and k_cat can be used to gauge the relative 
activities of enzymes and substrates. In most cases, K_m is a measure of the stability of the ES 
complex and k_cat is similar to the rate constant for the conversion of ES to E + P when 
the substrate is not limiting (region A in Figure 5.5). Recall that k_cat is a measure of the 
catalytic activity of an enzyme indicating how many reactions a molecule of enzyme 
can catalyze per second. 

Examine region B of the hyperbolic curve in Figure 5.5. The concentration of S is 
very low and the curve approximates a straight line. Under these conditions, the reac- 
tion rate depends on the concentrations of both substrate and enzyme. In chemical 
terms, this is a second-order reaction and the velocity depends on a second-order rate 
constant defined by 

$$v_0 = k[E][S] \quad (5.25)$$

▲ Figure 5.5 Meanings of k_cat and k_cat/K_m. The catalytic constant (k_cat) is the rate constant 
for conversion of the ES complex to E + P. It is measured most easily when the enzyme is 
saturated with substrate (region A on the Michaelis-Menten curve shown). The ratio k_cat/K_m is 
the rate constant for the conversion of E + S to E + P at very low concentrations of substrate 
(region B). The reactions measured by these rate constants are summarized below the graph. 

We are interested in knowing how to determine this second-order rate constant since it 
tells us the rate of the enzyme-catalyzed reaction under physiological conditions. When 
Michaelis and Menten first wrote the full rate equation they used the form that included 
k_cat[E]_total rather than V_max (Equation 5.24). Now that we understand the meaning of k_cat 
we can substitute k_cat[E]_total in the Michaelis-Menten equation (Equation 5.23) in place 
of V_max. If we consider only the region of the Michaelis-Menten curve at a very low [S] 
then this equation can be simplified by neglecting the [S] in the denominator since [S] 
is much less than K_m. 

$$v_0 = \frac{k_{\text{cat}}[E][S]}{K_m + [S]} \approx \frac{k_{\text{cat}}}{K_m}[E][S] \quad (5.26)$$




Comparing Equations 5.25 and 5.26 reveals that the second-order rate constant is 
closely approximated by k_cat/K_m. Thus, the ratio k_cat/K_m is an apparent second-order 
rate constant for the formation of E + P from E + S when the overall reaction is limited 
by the encounter of S with E. For the fastest enzymes this ratio approaches 10⁸ to 10⁹ M⁻¹ s⁻¹, 
the fastest rate at which two uncharged solutes can approach each other by diffusion at 
physiological temperature. Enzymes that can catalyze reactions at this extremely rapid 
rate are discussed in Section 6.4. 
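
How good the low-substrate approximation is can be checked with a few numbers. The Python sketch below compares the full Michaelis-Menten velocity with the second-order approximation (k_cat/K_m)[E][S]. The values of k_cat, K_m, and enzyme concentration are assumed for illustration only.

```python
def mm_velocity(s, e_total, k_cat, k_m):
    """Full Michaelis-Menten rate with V_max written as k_cat * [E]total."""
    return k_cat * e_total * s / (k_m + s)

def low_s_velocity(s, e_total, k_cat, k_m):
    """Second-order approximation, valid only when [S] is much less than K_m."""
    return (k_cat / k_m) * e_total * s

K_CAT = 1.0e3     # hypothetical catalytic constant, s^-1
K_M = 1.0e-4      # hypothetical Michaelis constant, M
E_TOTAL = 1.0e-9  # hypothetical enzyme concentration, M

print(f"k_cat/K_m = {K_CAT / K_M:.1e} M^-1 s^-1")
for s in (1.0e-7, 1.0e-6, 1.0e-5, 1.0e-4):
    full = mm_velocity(s, E_TOTAL, K_CAT, K_M)
    approx = low_s_velocity(s, E_TOTAL, K_CAT, K_M)
    print(f"[S] = {s:.0e} M   full = {full:.3e}   approx = {approx:.3e}   ratio = {approx / full:.3f}")
```

The ratio of the two rates is 1 + [S]/K_m, so the approximation overestimates the rate by 1% when [S] is one-hundredth of K_m and by a factor of two when [S] equals K_m.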

The k_cat/K_m ratio is useful for comparing the activities of different enzymes. It is 
also possible to assess the efficiency of an enzyme by measuring its catalytic proficiency. 
This value is equal to the rate constant for a reaction in the presence of the enzyme 
(k_cat/K_m) divided by the rate constant for the same reaction in the absence of the 
enzyme (k_n). Surprisingly few catalytic proficiency values are known because most 
chemical reactions occur extremely slowly in the absence of enzymes, so slowly that their 
nonenzymatic rates are very difficult to measure. The reaction rates are often measured 
in special steel-enclosed glass vessels at temperatures in excess of 300°C. 

Table 5.2 lists several examples of known catalytic proficiencies. Typical values 
range from 10¹⁴ to 10²⁰ but some are quite a bit higher (up to 10²⁴). The current record 
holder is uroporphyrinogen decarboxylase, an enzyme required for a step in the 
porphyrin synthesis pathway. The difficulty in obtaining rate constants for nonenzymatic 
reactions is illustrated by the half-life for the uncatalyzed reaction: about 2 billion 
years! The catalytic proficiency values in Table 5.2 emphasize one of the main properties 
of enzymes, namely, their ability to increase the rates of reactions that would normally 
occur too slowly to be useful. 
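
The proficiency calculation itself is a single division. The Python sketch below applies it to three rows whose rate constants are taken from Table 5.2; the function and variable names are illustrative choices, not notation from the text.

```python
def catalytic_proficiency(k_n, kcat_over_km):
    """Proficiency = (k_cat/K_m) / k_n; units are M^-1 because s^-1 cancels."""
    return kcat_over_km / k_n

# (nonenzymatic k_n in s^-1, enzymatic k_cat/K_m in M^-1 s^-1), values from Table 5.2
table_rows = {
    "Triose phosphate isomerase": (4e-6, 4e8),
    "Cytidine deaminase": (1e-10, 3e6),
    "Adenosine deaminase": (2e-10, 1e7),
}

proficiencies = {
    name: catalytic_proficiency(k_n, ratio) for name, (k_n, ratio) in table_rows.items()
}

for name, value in proficiencies.items():
    print(f"{name:28s} {value:.0e}")
```

The results match the order-of-magnitude values listed in the table for these three enzymes.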

5.5 Measurement of K_m and V_max 

The kinetic parameters of an enzymatic reaction can provide valuable information about 
the specificity and mechanism of the reaction. The key parameters are K_m and V_max 
because k_cat can be calculated if V_max is known. 


Table 5.2 Catalytic proficiencies of some enzymes 

Enzyme                                 Nonenzymatic      Enzymatic rate         Catalytic 
                                       rate constant     constant (k_cat/K_m    proficiency 
                                       (k_n in s⁻¹)      in M⁻¹ s⁻¹) 
Carbonic anhydrase                     10⁻¹              7 × 10⁶                7 × 10⁷ 
[name missing]                         4 × 10⁻⁹          9 × 10⁷                2 × 10¹⁶ 
Chorismate mutase                      10⁻⁵              2 × 10⁶                2 × 10¹¹ 
Triose phosphate isomerase             4 × 10⁻⁶          4 × 10⁸                10¹⁴ 
Cytidine deaminase                     10⁻¹⁰             3 × 10⁶                3 × 10¹⁶ 
Adenosine deaminase                    2 × 10⁻¹⁰         10⁷                    5 × 10¹⁶ 
Mandelate racemase                     3 × 10⁻¹³         10⁶                    3 × 10¹⁸ 
[name missing]                         7 × 10⁻¹⁴         10⁷                    10²⁰ 
[name missing]                         10⁻¹³             10⁹                    10²¹ 
Arginine decarboxylase                 9 × 10⁻¹⁶         10⁶                    10²¹ 
Alkaline phosphatase                   10⁻¹⁵             3 × 10⁷                3 × 10²² 
Orotidine 5'-phosphate decarboxylase   3 × 10⁻¹⁶         6 × 10⁷                2 × 10²³ 
Uroporphyrinogen decarboxylase         10⁻¹⁷             2 × 10⁷                2 × 10²⁴ 

K_m and V_max for an enzyme-catalyzed reaction can be determined in several ways. 
Both values can be obtained by the analysis of initial velocities at a series of substrate 
concentrations and a fixed concentration of enzyme. In order to obtain reliable values 
for the kinetic constants the [S] points must be spread out both below and above K_m to 
produce a hyperbola. It is difficult to determine either K_m or V_max directly from a graph 

▲ Maximum catalytic proficiency. Uroporphyrinogen decarboxylase is the current 
record holder for maximum catalytic proficiency. It catalyzes a step in the heme 
synthesis pathway. The enzyme shown here is a human (Homo sapiens) variant with a 
bound porphyrin molecule at the active site of each monomer. [PDB 2Q71] 


We have seen that a plot of substrate concentration ([S]) 
versus the initial velocity of a reaction (v_0) produces a 
hyperbolic curve as shown in Figures 5.4 and 5.5. The general 
equation for a rectangular hyperbola (Equation 5.12) and 
the Michaelis-Menten equation have the same form 
(Equation 5.13). 

It’s very difficult to determine V_max from a plot of enzyme 
kinetic data since the hyperbolic curve that shows the 
relationship between substrate concentration and initial velocity is 
asymptotic to V_max and it is experimentally difficult to achieve 
the concentration of substrate required to estimate V_max. For 
these reasons, it is often easier to convert the hyperbolic curve 
to a linear form that matches the general formula y = mx + b, 
where m is the slope of the line and b is the y-axis intercept. 
The first step in transforming the original Michaelis-Menten 
equation to this general form of a linear equation is to invert 
the terms so that the K_m + [S] term is on top of the right-hand 
side. This is done by taking the reciprocal of each side, a 
transformation that will be familiar to anyone who has worked 
with hyperbolic curves. 

The next two steps involve separating terms and cancel- 
ing [S] in the second term on the right-hand side of the 
equation. This form of the Michaelis-Menten equation is 
called the Lineweaver-Burk equation and it resembles the 
general form of a linear equation, y - mx + b , where y is the 
reciprocal of v 0 and x values are the reciprocal of [S]. A plot 
of data in this form is referred to as a double-reciprocal plot. 
The slope of the line will be K m /V max and the y-axis intercept 
will be W max . 

The original reason for this sort of transformation was 
to calculate K m and V max from experimental data. It was eas- 
ier to plot the reciprocal values of v 0 and [S] and draw a 
straight line through the points in order to calculate the ki- 
netic constants. Nowadays, there are computer programs that 
can accurately fit the data to a hyperbolic curve and calculate 
the constants so the Lineweaver-Burk plot is no longer nec- 
essary for this type of analysis. In this book we will still use 
the Lineweaver-Burk plots to illustrate some general features 
of enzyme kinetics but they are rarely used for their original 
purpose of data analysis. 

1 = K m + [S] ^ = K m + [S] = ^ I< m b 1 + 

v o y max [s] v 0 y max [s] v max [S] v 0 kmax [S] V m 

5.6 Kinetics of Multisubstrate Reactions 147 

of initial velocity versus concentration because the curve approaches V max asymptoti- 
cally. However, accurate values can be determined by using a suitable computer pro- 
gram to fit the experimental results to the equation for the hyperbola. 
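The kind of computer fit mentioned above can be sketched in a few lines. This is a minimal illustration, not the method used by any particular program: it assumes noise-free example data, and the function names, the search bounds, and the choice of a golden-section search over ln(K_m) are all assumptions made for the sketch. For any trial K_m the best V_max follows from linear least squares, so only K_m has to be searched.

```python
import math


def michaelis_menten(s, vmax, km):
    """Initial velocity predicted by the Michaelis-Menten hyperbola."""
    return vmax * s / (km + s)


def fit_michaelis_menten(s_values, v_values, iters=100):
    """Least-squares fit of (V_max, K_m) directly to the hyperbola.

    Hypothetical helper: searches ln(K_m) by golden-section search and
    solves for the best V_max in closed form at each trial K_m.
    """

    def best_vmax(km):
        x = [s / (km + s) for s in s_values]
        return sum(a * v for a, v in zip(x, v_values)) / sum(a * a for a in x)

    def sse(log_km):
        km = math.exp(log_km)
        vmax = best_vmax(km)
        return sum((v - michaelis_menten(s, vmax, km)) ** 2
                   for s, v in zip(s_values, v_values))

    # Assumed search window: three decades either side of the [S] range.
    lo = math.log(1e-3 * min(s_values))
    hi = math.log(1e3 * max(s_values))
    ratio = (math.sqrt(5.0) - 1.0) / 2.0
    for _ in range(iters):
        c = hi - ratio * (hi - lo)
        d = lo + ratio * (hi - lo)
        if sse(c) < sse(d):
            hi = d
        else:
            lo = c
    km = math.exp((lo + hi) / 2.0)
    return best_vmax(km), km


# Example data generated from V_max = 10 and K_m = 2 (arbitrary units),
# with [S] points spread below and above K_m.
substrate = [0.5, 1.0, 2.0, 4.0, 8.0, 16.0]
velocity = [michaelis_menten(s, 10.0, 2.0) for s in substrate]
vmax_fit, km_fit = fit_michaelis_menten(substrate, velocity)
print(f"V_max ~ {vmax_fit:.3f}, K_m ~ {km_fit:.3f}")
```

With real, noisy measurements a dedicated fitting library would normally be preferred. The point of the sketch is only that the constants come from the hyperbola itself rather than from a linearized plot.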

The Michaelis-Menten equation can be rewritten in order to obtain values for V_max
and K_m from straight lines on graphs. The most commonly used transformation is the
double-reciprocal, or Lineweaver-Burk, plot in which the values of 1/v_0 are plotted
against 1/[S] (Figure 5.6). The absolute value of 1/K_m is obtained from the intercept of
the line at the x axis (the intercept itself is -1/K_m), and the value of 1/V_max is
obtained from the y intercept. Although double-reciprocal plots are not the most accurate
methods for determining kinetic constants, they are easily understood and provide
recognizable patterns for the study of enzyme inhibition, an extremely important aspect
of enzymology that we will examine in Section 5.7.

Values of k_cat can be obtained from measurements of V_max only when the absolute
concentration of the enzyme is known. Values of K_m can be determined even when enzymes
have not been purified provided that only one enzyme in the impure preparation can
catalyze the observed reaction.

Lineweaver-Burk equation:

$$\frac{1}{v_0} = \left(\frac{K_m}{V_{max}}\right)\frac{1}{[S]} + \frac{1}{V_{max}}$$
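The double-reciprocal analysis can be sketched as an ordinary straight-line fit of 1/v_0 against 1/[S]. The helper name and the example data below are assumptions made for illustration. The slope is read as K_m/V_max and the y intercept as 1/V_max.

```python
def lineweaver_burk(s_values, v_values):
    """Estimate (V_max, K_m) from a least-squares line through the
    double-reciprocal points (1/[S], 1/v_0).

    Hypothetical helper; it assumes every [S] and v_0 is nonzero.
    """
    x = [1.0 / s for s in s_values]
    y = [1.0 / v for v in v_values]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    slope = (sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
             / sum((xi - mean_x) ** 2 for xi in x))
    intercept = mean_y - slope * mean_x   # y intercept = 1/V_max
    vmax = 1.0 / intercept
    km = slope * vmax                     # slope = K_m/V_max
    return vmax, km


# Example data generated from V_max = 10 and K_m = 2 (arbitrary units)
substrate = [1.0, 2.0, 4.0, 8.0]
velocity = [10.0 * s / (2.0 + s) for s in substrate]
print(lineweaver_burk(substrate, velocity))
```

Because reciprocals exaggerate the error in the smallest velocities, this linear fit weights the low-[S] points most heavily, which is one reason the text calls double-reciprocal plots less accurate than a direct fit.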

5.6 Kinetics of Multisubstrate Reactions 

Until now, we have only been considering reactions where a single substrate is con- 
verted to a single product. Let’s consider a reaction in which two substrates, A and B, are 
converted to products P and Q. 

▲ Figure 5.6 

Double-reciprocal (Lineweaver-Burk) plot. 

This plot is derived from a linear transforma- 
tion of the Michaelis-Menten equation. 
Values of 1/v_0 are plotted as a function of
1/[S] values.

E + A + B ⇌ (EAB) → E + P + Q (5.27)

Kinetic measurements for such multisubstrate reactions are a little more complicated
than simple one-substrate enzyme kinetics. For many purposes, such as designing an
enzyme assay, it’s sufficient simply to determine the K_m for each substrate in the
presence of saturating amounts of each of the other substrates as we described for
chemical reactions (Section 5.2A). The simple enzyme kinetics discussed in this chapter
can be extended to distinguish among several mechanistic possibilities for multisubstrate
reactions, such as group transfer reactions. This is done by measuring the effect of
variations in the concentration of one substrate on the kinetic results obtained for the
other.

Multisubstrate reactions can occur by several different kinetic schemes. These 
schemes are called kinetic mechanisms because they are derived entirely from kinetic 
experiments. Kinetic mechanisms are commonly represented using the notation intro- 
duced by W. W. Cleland. The sequence of steps proceeds from left to right (Figure 5.7). 
The addition of substrate molecules (A, B, C, . . .) to the enzyme and the release of 
products (P, Q, R, . . .) from the enzyme are indicated by arrows pointing toward 
(substrate binding) or from (product release) the line. The various forms of the en- 
zyme (free E, ES complexes, or EP complexes) are written under a horizontal line. The 
ES complexes that undergo chemical transformation when the active site is filled are 
shown in parentheses. 

Sequential reactions (Figure 5.7a) require all the substrates to be present before any 
product is released. Sequential reactions can be either ordered, with an obligatory order 
for the addition of substrates and release of products, or random. In ping-pong reactions 
(Figure 5.7b), a product is released before all the substrates are bound. In a bisubstrate 
ping-pong reaction, the first substrate is bound, the enzyme is altered by substitution, 
and the first product is released. Then the second substrate is bound, the altered enzyme 
is restored to its original form, and the second product is released. A ping-pong mecha- 
nism is sometimes called a substituted-enzyme mechanism because of the covalent 
binding of a portion of a substrate to the enzyme. The binding and release of ligands in 
a ping-pong mechanism are usually indicated by slanted lines. The two forms of the en- 
zyme are represented by E (unsubstituted) and F (substituted). 



Irreversible inhibitors are described in 
Section 5.8. 


Reversible inhibitors bind to enzymes 
and either prevent substrate binding or 
block the reaction leading to formation 
of product. 

(a) Sequential reactions

[Diagram: substrates A and B bind to the enzyme and products P and Q are released.]

(b) Ping-pong reaction

[Diagram: enzyme forms written left to right under the line: E, (EA)(FP), F, (FB)(EQ), E.]

▲ Figure 5.7

Notation for bisubstrate reactions. (a) In sequential reactions, all substrates are bound before a product
is released. The binding of substrates may be either ordered or random. (b) In ping-pong reactions,
one substrate is bound and a product is released, leaving a substituted enzyme. A second substrate
is then bound and a second product released, restoring the enzyme to its original form.

5.7 Reversible Enzyme Inhibition 

An enzyme inhibitor (I) is a compound that binds to an enzyme and interferes with its
activity. Inhibitors can act by preventing the formation of the ES complex or by blocking
the chemical reaction that leads to the formation of product. As a general rule,
inhibitors are small molecules that bind reversibly to the enzyme they inhibit. Cells
contain many natural enzyme inhibitors that play important roles in regulating
metabolism. Artificial inhibitors are used experimentally to investigate enzyme
mechanisms and decipher metabolic pathways. Some drugs, and many poisons, are enzyme
inhibitors.

Some inhibitors bind covalently to enzymes causing irreversible inhibition but
most biologically relevant inhibition is reversible. Reversible inhibitors are bound to
enzymes by the same weak, noncovalent forces that bind substrates and products.
The equilibrium between free enzyme (E) plus inhibitor (I) and the EI complex is
characterized by a dissociation constant. In this case, the constant is called the
inhibition constant, K_i.

$$E + I \rightleftharpoons EI \qquad K_d = K_i = \frac{[E][I]}{[EI]} \qquad (5.28)$$

The basic types of reversible inhibition are competitive, uncompetitive, noncom- 
petitive and mixed. These can be distinguished experimentally by their effects on the ki- 
netic behavior of enzymes (Table 5.3). Figure 5.8 shows diagrams representing modes 
of reversible enzyme inhibition. 


Table 5.3 Effects of reversible inhibitors on kinetic constants

Type of inhibitor                        Effect

Competitive (I binds to E only)          Raises K_m
                                         V_max remains unchanged

Uncompetitive (I binds to ES only)       Lowers V_max and K_m
                                         Ratio of V_max/K_m remains unchanged

Noncompetitive (I binds to E or ES)      Lowers V_max
                                         K_m remains unchanged

A. Competitive Inhibition 

Competitive inhibitors are the most commonly encountered inhibitors in biochemistry.
In competitive inhibition, the inhibitor can bind only to free enzyme molecules
that have not bound any substrate. Competitive inhibition is illustrated in Figure 5.8
and by the kinetic scheme in Figure 5.9a. In this scheme only ES can lead to the
formation of product. The formation of an EI complex removes enzyme from the normal
pathway.

Once a competitive inhibitor is bound to an enzyme molecule, a substrate molecule
cannot bind to that enzyme molecule. Conversely, the binding of substrate to an
enzyme molecule prevents the binding of an inhibitor. In other words, S and I compete
for binding to the enzyme molecule. Most commonly, S and I bind at the same site on
the enzyme, the active site. This type of inhibition is termed classical competitive
inhibition (Figure 5.8). This is not the only kind of competitive inhibition (see
Figure 5.8). In some cases, such as allosteric enzymes (Section 5.10), the inhibitor
binds at a different site and this alters the substrate binding site, preventing
substrate binding. This type of inhibition is called nonclassical competitive inhibition.
When both I and S are present in a solution, the proportion of the enzyme that is able
to form ES complexes depends on the concentrations of substrate and inhibitor and their
relative affinities for the enzyme.

(a) Classical competitive inhibition
The substrate (S) and the inhibitor (I) compete for the same site on the enzyme.

(b) Nonclassical competitive inhibition
The binding of substrate (S) at the active site prevents the binding of inhibitor (I)
at a separate site and vice versa.

(c) Uncompetitive inhibition
The inhibitor (I) binds only to the enzyme substrate (ES) complex, preventing the
conversion of substrate (S) to product.

(d) Noncompetitive inhibition
The inhibitor (I) can bind to either E or ES. The enzyme becomes inactive when
I binds. Substrate (S) can still bind to the EI complex but conversion to
product is inhibited.

◄ Figure 5.8

Diagrams of reversible enzyme inhibition. In this scheme, catalytically competent enzymes
are green and inactive enzymes are red.

▲ Competitive inhibition. The active ingredient in the weed killer Roundup® is
glyphosate, a competitive inhibitor of the plant enzyme
5-enolpyruvylshikimate-3-phosphate synthase. (See Box 17.2 in Chapter 17.)

[Kinetic scheme for Figure 5.9a: E + S ⇌ ES → E + P, with E + I ⇌ EI.]

▲ Figure 5.9

Competitive inhibition. (a) Kinetic scheme illustrating the binding of I to E. Note that this is an
expansion of Equation 5.11 that includes formation of the EI complex. (b) Double-reciprocal plot. In
competitive inhibition, V_max remains unchanged and K_m increases. The black line labeled “Control”
is the result in the absence of inhibitor. The red lines are the results in the presence of inhibitor,
with the arrow showing the direction of increasing [I].

▲ Ibuprofen, the active ingredient in many over-the-counter painkillers, is a competitive
inhibitor of the enzyme cyclooxygenase (COX). (See Box 16.1 in Chapter 16.)

The amount of EI can be reduced by increasing the concentration of S. At sufficiently
high concentrations the enzyme can still be saturated with substrate. Therefore,
the maximum velocity is the same in the presence or in the absence of an inhibitor.
The more competitive inhibitor present, the more substrate needed for half-saturation.
We have shown that the concentration of substrate at half-saturation is K_m. In the
presence of increasing concentrations of a competitive inhibitor, K_m increases. The new
value is usually referred to as the apparent K_m (K_m^app). On a double-reciprocal plot,
adding a competitive inhibitor shows as a decrease in the absolute value of the intercept
at the x axis, 1/K_m, whereas the y intercept, 1/V_max, remains the same (Figure 5.9b).

Many classical competitive inhibitors are substrate analogs — compounds that are 
structurally similar to substrates. The analogs bind to the enzyme but do not react. 
For example, the enzyme succinate dehydrogenase converts succinate to fumarate 
(Section 13.3#6). Malonate resembles succinate and acts as a competitive inhibitor of 
the enzyme. 
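The behavior described for competitive inhibition can be checked numerically. The sketch below assumes the standard competitive rate law, in which K_m is multiplied by (1 + [I]/K_i) and V_max is left alone. That expression is not written out in the text above, and the function names and numbers are illustrative only.

```python
def apparent_km_competitive(km, inhibitor, ki):
    """Apparent K_m in the presence of a competitive inhibitor.

    Assumption: K_m^app = K_m * (1 + [I]/K_i), the usual textbook form.
    """
    return km * (1.0 + inhibitor / ki)


def competitive_rate(s, inhibitor, vmax, km, ki):
    """Initial velocity with a competitive inhibitor present."""
    return vmax * s / (apparent_km_competitive(km, inhibitor, ki) + s)


# Arbitrary example constants: V_max = 10, K_m = 2, K_i = 1
for conc_i in (0.0, 1.0, 3.0):
    km_app = apparent_km_competitive(2.0, conc_i, 1.0)
    v_half = competitive_rate(km_app, conc_i, 10.0, 2.0, 1.0)
    v_high = competitive_rate(1e6, conc_i, 10.0, 2.0, 1.0)
    print(f"[I] = {conc_i}: K_m^app = {km_app}, "
          f"v at [S] = K_m^app: {v_half}, v at very high [S]: {v_high:.3f}")
```

In this sketch the half-saturating substrate concentration rises with [I], while the velocity at very high [S] stays close to the same V_max, which is the pattern summarized in Figure 5.9b.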

B. Uncompetitive Inhibition 

Uncompetitive inhibitors bind only to ES and not to free enzyme (Figure 5.10a). In
uncompetitive inhibition, V_max is decreased (1/V_max is increased) by the conversion of
some molecules of E to the inactive form ESI. Since it is the ES complex that binds I,
the decrease in V_max is not reversed by the addition of more substrate. Uncompetitive
inhibitors also decrease the K_m (seen as an increase in the absolute value of 1/K_m on a
double-reciprocal plot) because the equilibria for the formation of both ES and ESI are
shifted toward the complexes by the binding of I. Experimentally, the lines on a
double-reciprocal plot representing varying concentrations of an uncompetitive inhibitor
all have the same slope, indicating proportionally decreased values for K_m and V_max
(Figure 5.10b). This type of inhibition usually occurs only with multisubstrate reactions.
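The statement that uncompetitive inhibition leaves the double-reciprocal slope unchanged can be illustrated with a short calculation. The sketch assumes the standard result that both V_max and K_m are divided by the same factor, (1 + [I]/K_i), where K_i here describes binding of I to ES. The helper name and constants are illustrative assumptions.

```python
def uncompetitive_constants(vmax, km, inhibitor, ki_es):
    """Apparent (V_max, K_m) with an uncompetitive inhibitor.

    Assumption: both constants are divided by (1 + [I]/K_i), where
    K_i is the dissociation constant for I binding to the ES complex.
    """
    factor = 1.0 + inhibitor / ki_es
    return vmax / factor, km / factor


# Arbitrary example constants: V_max = 10, K_m = 2, K_i = 1
for conc_i in (0.0, 1.0, 3.0):
    vmax_app, km_app = uncompetitive_constants(10.0, 2.0, conc_i, 1.0)
    slope = km_app / vmax_app          # slope of the double-reciprocal line
    y_intercept = 1.0 / vmax_app       # rises as [I] increases
    print(f"[I] = {conc_i}: slope = {slope:.3f}, 1/V_max = {y_intercept:.3f}")
```

The slope stays constant while the y intercept climbs, so the lines for different inhibitor concentrations are parallel.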

C. Noncompetitive Inhibition 

Noncompetitive inhibitors can bind to E or ES forming inactive EI or ESI complexes,
respectively (Figure 5.11a). These inhibitors are not substrate analogs and do not bind at
the same site as S. The classic case of noncompetitive inhibition is characterized by an
apparent decrease in V_max (1/V_max appears to increase) with no change in K_m. On a
double-reciprocal plot, the lines for classic noncompetitive inhibition intersect at the
point on the x axis corresponding to -1/K_m (Figure 5.11b). The common x-axis intercept
indicates that K_m isn’t affected. The effect of noncompetitive inhibition is to reversibly
titrate E and ES with I, removing active enzyme molecules from solution. This inhibition
cannot be overcome by the addition of S. Classic noncompetitive inhibition is rare
but examples are known among allosteric enzymes. In these cases, the noncompetitive
inhibitor probably alters the conformation of the enzyme to a shape that can still bind S
but cannot catalyze any reaction.

[Kinetic scheme for Figure 5.10a: E + S ⇌ ES → E + P, with ES + I ⇌ ESI.]

[Kinetic scheme for Figure 5.11a: E + S ⇌ ES → E + P, with E + I ⇌ EI, ES + I ⇌ ESI,
and EI + S ⇌ ESI.]

Most enzymes do not conform to the classic form of noncompetitive inhibition
where K_m is unchanged. In most cases, both K_m and V_max are affected because the
affinity of the inhibitor for E is different from its affinity for ES. These cases are
often referred to as mixed inhibition (Figure 5.12).
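The difference between classic noncompetitive and mixed inhibition can be sketched by giving the inhibitor two separate dissociation constants, one for E and one for ES. The formulas below are the standard mixed-inhibition expressions, which the text does not write out, and the parameter names ki_e and ki_es are hypothetical labels for the two constants.

```python
def mixed_constants(vmax, km, inhibitor, ki_e, ki_es):
    """Apparent (V_max, K_m) when I binds E (ki_e) and ES (ki_es).

    Assumption: V_max is divided by alpha' and K_m is multiplied by
    alpha/alpha', with alpha = 1 + [I]/ki_e and alpha' = 1 + [I]/ki_es.
    """
    alpha = 1.0 + inhibitor / ki_e
    alpha_prime = 1.0 + inhibitor / ki_es
    return vmax / alpha_prime, km * alpha / alpha_prime


# Classic noncompetitive case: equal affinity for E and ES
print(mixed_constants(10.0, 2.0, 3.0, ki_e=1.0, ki_es=1.0))

# Mixed case: tighter binding to E than to ES
print(mixed_constants(10.0, 2.0, 3.0, ki_e=1.0, ki_es=3.0))
```

When the two constants are equal, only V_max changes, which matches the classic noncompetitive pattern. When they differ, both V_max and K_m shift, as described for mixed inhibition.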

D. Uses of Enzyme Inhibition 

Reversible enzyme inhibition provides a powerful tool for probing enzyme activity. In- 
formation about the shape and chemical reactivity of the active site of an enzyme can be 
obtained from experiments involving a series of competitive inhibitors with systemati- 
cally altered structures. 

The pharmaceutical industry uses enzyme inhibition studies to design clinically 
useful drugs. In many cases, a naturally occurring enzyme inhibitor is used as the start- 
ing point for drug design. Instead of using random synthesis and testing of potential in- 
hibitors, some investigators are turning to a more efficient approach known as rational 
drug design. Theoretically, with the greatly expanded bank of knowledge about enzyme 
structure, inhibitors can now be rationally designed to fit the active site of a target 
enzyme. The effects of a synthetic compound are tested first on isolated enzymes and 
then in biological systems. However, even if a compound has suitable inhibitory activ- 
ity, other problems may be encountered. For example, the drug may not enter the target 
cells, may be rapidly metabolized to an inactive compound, may be toxic to the host or- 
ganism, or the target cell may develop resistance to the drug. 

◄ Figure 5.10

Uncompetitive inhibition. (a) Kinetic scheme illustrating the binding of I to ES.
(b) Double-reciprocal plot. In uncompetitive inhibition, both V_max and K_m decrease
(i.e., the absolute values of both 1/V_max and 1/K_m obtained from the y and x
intercepts, respectively, increase). The ratio K_m/V_max, the slope of the lines,
remains unchanged.

◄ Figure 5.11

Classic noncompetitive inhibition. (a) Kinetic scheme illustrating the binding of I to E
or ES. (b) Double-reciprocal plot. V_max decreases, but K_m remains the same.

▲ Figure 5.12

Double-reciprocal plot showing mixed inhibition. Both V_max and K_m are affected when
the inhibitor binds with different affinities to E and ES.


▲ Figure 5.13

Comparison of a substrate and a designed inhibitor of purine nucleoside phosphorylase.
The two substrates of this enzyme are guanosine and inorganic phosphate. (a) Guanosine.
(b) A potent inhibitor of the enzyme. N-9 of guanosine has been replaced by a carbon
atom. The chlorinated benzene ring binds to the sugar-binding site of the enzyme, and
the acetate side chain binds to the phosphate-binding site.

The advances made in drug synthesis are exemplified by the design of a series of in- 
hibitors of the enzyme purine nucleoside phosphorylase. This enzyme catalyzes a 
degradative reaction between phosphate and the nucleoside guanosine whose structure 
is shown in Figure 5.13a. With computer modeling, the structures of potential in- 
hibitors were designed and fit into the active site of the enzyme. One such compound 
(Figure 5.13b) was synthesized and found to be 100 times more inhibitory than any 
compound made by the traditional trial- and-error approach. Researchers hope that the 
rational design approach will produce a drug suitable for treating autoimmune disor- 
ders such as rheumatoid arthritis and multiple sclerosis. 

5.8 Irreversible Enzyme Inhibition 

In contrast to a reversible enzyme inhibitor, an irreversible enzyme inhibitor forms a 
stable covalent bond with an enzyme molecule thus removing active molecules from 
the enzyme population. Irreversible inhibition typically occurs by alkylation or acylation 
of the side chain of an active-site amino acid residue. There are many naturally occur- 
ring irreversible inhibitors as well as the synthetic examples described here. 

An important use of irreversible inhibitors is the identification of amino acid 
residues at the active site by specific substitution of their reactive side chains. In this 
process, an irreversible inhibitor that reacts with only one type of amino acid is in- 
cubated with a solution of enzyme that is then tested for loss of activity. Ionizable 
side chains are modified by acylation or alkylation reactions. For example, free 
amino groups such as the ε-amino group of lysine react with an aldehyde to form a
Schiff base that can be stabilized by reduction with sodium borohydride (NaBH4)
(Figure 5.14). 

The nerve gas diisopropyl fluorophosphate (DFP) is one of a group of organic 
phosphorus compounds that inactivate hydrolases with a reactive serine as part of the 
active site. These enzymes are called serine proteases or serine esterases, depending on 
their reaction specificity. The serine protease chymotrypsin, an important digestive 
enzyme, is inhibited irreversibly by DFP (Figure 5.15). DFP reacts with the serine 
residue at chymotrypsin’s active site (Ser-195) to produce diisopropylphosphoryl-chymotrypsin.

Some organophosphorus inhibitors are used in agriculture as insecticides; others, 
such as DFP, are useful reagents for enzyme research. The original organophosphorus 
nerve gases are extremely toxic poisons developed for military use. The major biological 
action of these poisons is irreversible inhibition of the serine esterase acetyl- 
cholinesterase that catalyzes hydrolysis of the neurotransmitter acetylcholine. When 
acetylcholine released from an activated nerve cell binds to its receptor on a second 
nerve cell, it triggers a nerve impulse. The action of acetylcholinesterase restores the cell 
to its resting state. Inhibition of this enzyme can cause paralysis. 


[Reaction scheme: the amino group at the end of the lysine side chain, (CH2)4-NH2, reacts
with an aldehyde to form a Schiff base with release of H2O; NaBH4 then reduces the Schiff base.]

▲ Figure 5.14

Reaction of the ε-amino group of a lysine residue with an aldehyde. Reduction of the Schiff base with
sodium borohydride (NaBH4) forms a stable substituted enzyme.


Figure 5.15 ► 

Irreversible Inhibition by DFP. Diisopropyl fluorophosphate (DFP) reacts with a single, highly nucle- 
ophilic serine residue (Ser-195) at the active site of chymotrypsin, producing inactive diisopropyl- 
phosphoryl-chymotrypsin. DFP inactivates serine proteases and serine esterases. 


5.9 Regulation of Enzyme Activity 

At the beginning of this chapter, we listed several advantages to using enzymes as catalysts 
in biochemical reactions. Clearly, the most important advantage is to speed up reactions 
that would otherwise take place too slowly to sustain life. One of the other advantages of 
enzymes is that their catalytic activity can be regulated in various ways. The amount of 
an enzyme can be controlled by regulating the rate of its synthesis or degradation. This 
mode of control occurs in all species but it often takes many minutes or hours to 
synthesize new enzymes or to degrade existing enzymes. 

In all organisms, rapid control — on the scale of seconds or less — can be accom- 
plished through reversible modulation of the activity of regulated enzymes. In this con- 
text, we define regulated enzymes as those enzymes whose activity can be modified in a 
manner that affects the rate of an enzyme -catalyzed reaction. In many cases, these regu- 
lated enzymes control a key step in a metabolic pathway. The activity of a regulated en- 
zyme changes in response to environmental signals, allowing the cell to respond to 
changing conditions by adjusting the rates of its metabolic processes. 

In general, regulated enzymes become more active catalysts when the concentra- 
tions of their substrates increase or when the concentrations of the products of their 
metabolic pathways decrease. They become less active when the concentrations of their 
substrates decrease or when the products of their metabolic pathways accumulate. Inhi- 
bition of the first enzyme unique to a pathway conserves both material and energy by 
preventing the accumulation of intermediates and the ultimate end product. The activity
of regulated enzymes can be controlled by noncovalent allosteric modulation or covalent
modification.

Allosteric enzymes are enzymes whose properties are affected by changes in struc- 
ture. The structural changes are mediated by interaction with small molecules. We saw 
an example of allostery in the previous chapter when we examined the binding of oxygen 
to hemoglobin. Allosteric enzymes often do not exhibit typical Michaelis-Menten kinet- 
ics due to cooperative binding of substrate, as is the case with hemoglobin. 

Figure 5.16 shows a v 0 versus [S] curve for an allosteric enzyme with cooperative 
binding of substrate. Sigmoidal curves result from the transition between two states of 
the enzyme. In the absence of substrate, the enzyme is in the T state. The conformation 
of each subunit is in a shape that binds substrate inefficiently and the rate of the reac- 
tion is slow. As substrate concentration is increased, enzyme molecules begin to bind 
substrate even though the affinity of the enzyme in the T state is low. When a subunit 
binds substrate, the enzyme undergoes a conformational change that converts the en- 
zyme to the R state and the reaction takes place. The kinetic properties of the enzyme 
subunit in the T state and the R state are quite different — each conformation by itself 
could exhibit standard Michaelis-Menten kinetics. 

The conformational change in the subunit that initially binds a substrate molecule 
affects the other subunits in the multisubunit enzyme. The conformations of these 
other subunits are shifted toward the R state where their affinity for substrate is much 
higher. They can now bind substrate at a much lower concentration than when they 
were in the T state. 
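The contrast between a sigmoidal and a hyperbolic v_0 versus [S] curve can be illustrated numerically. The text does not give an equation for the sigmoidal curve, so the sketch below substitutes the Hill equation, a common phenomenological description of cooperative binding. The Hill coefficient n, the half-saturation constant, and the function name are assumptions made for this illustration and are not a model of the T and R states themselves.

```python
def hill_rate(s, vmax, k_half, n):
    """Velocity from the Hill equation (assumed stand-in for cooperativity).

    n = 1 reduces to the Michaelis-Menten hyperbola; n > 1 gives a
    sigmoidal curve. k_half is the [S] giving half of V_max.
    """
    return vmax * s ** n / (k_half ** n + s ** n)


# Arbitrary example constants: V_max = 10, half-saturation at [S] = 2
print("[S]    hyperbolic (n=1)   sigmoidal (n=4)")
for s in (0.25, 0.5, 1.0, 2.0, 4.0, 8.0):
    v_hyper = hill_rate(s, 10.0, 2.0, 1)
    v_sigm = hill_rate(s, 10.0, 2.0, 4)
    print(f"{s:<6} {v_hyper:<18.3f} {v_sigm:.3f}")
```

At low [S] the cooperative curve lies well below the hyperbola and then rises steeply around the half-saturation point, which is the qualitative shape described for Figure 5.16.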

Allosteric phenomena are responsible for the reversible control of many regulated
enzymes. In Section 4.13C, we saw how the conformation of hemoglobin and its affinity
for oxygen change when 2,3-bisphosphoglycerate is bound. Many regulated enzymes
also undergo allosteric transitions between active (R) states and inactive (T) states.
These enzymes have a second ligand-binding site away from their catalytic centers
called the regulatory site or allosteric site. An allosteric inhibitor or activator, also
called an allosteric modulator or allosteric effector, binds to the regulatory site and
causes a conformational change in the regulated enzyme. This conformational change is
transmitted to the active site of the enzyme, which changes shape sufficiently to alter
its activity. The regulatory and catalytic sites are physically distinct regions of the
protein, usually located on separate domains and sometimes on separate subunits.
Allosterically regulated enzymes are often larger than other enzymes.

[Structure diagram for Figure 5.15: diisopropyl fluorophosphate and the
diisopropylphosphoryl group attached to the active-site serine.]

Aspartate transcarbamoylase (ATCase), another well-characterized allosteric
enzyme, is described in Chapter 18.

▲ Figure 5.16

Cooperativity. Plot of initial velocity as a function of substrate concentration for an
allosteric enzyme exhibiting cooperative binding of substrate.

Allosteric enzymes often have multiple subunits and substrate binding is
cooperative. This produces a sigmoidal curve when velocity is plotted against
substrate concentration.

[Reaction scheme: fructose 6-phosphate is converted to fructose 1,6-bisphosphate by
phosphofructokinase-1.]

▲ Figure 5.17

Reaction catalyzed by phosphofructokinase-1.

First, we examine an enzyme that undergoes allosteric (noncovalent) regulation 
and then we list some general properties of such enzymes. Next, we describe two models 
that explain allosteric regulation in terms of changes in the conformation of regulated 
enzymes. Finally, we discuss a closely related group of regulatory enzymes — those subject 
to covalent modification. 


[Structure diagram: phosphoenolpyruvate.]

▲ Figure 5.18

Phosphoenolpyruvate. This intermediate of glycolysis is an allosteric inhibitor of
phosphofructokinase-1 from Escherichia coli.

A. Phosphofructokinase Is an Allosteric Enzyme 

Bacterial phosphofructokinase-1 (Escherichia coli) provides a good example of allosteric
inhibition and activation. Phosphofructokinase-1 catalyzes the ATP-dependent
phosphorylation of fructose 6-phosphate to produce fructose 1,6-bisphosphate and ADP
(Figure 5.17). This reaction is one of the first steps of glycolysis, an ATP-generating
pathway for glucose degradation described in detail in Chapter 11. Phosphoenolpyruvate
(Figure 5.18), an intermediate near the end of the glycolytic pathway, is an allosteric
inhibitor of E. coli phosphofructokinase-1. When the concentration of
phosphoenolpyruvate rises, it indicates that the pathway is blocked beyond that point.
Further production of phosphoenolpyruvate is prevented by inhibiting phosphofructokinase-1
(see feedback inhibition, Section 10.2C).

ADP is an allosteric activator of phosphofructokinase-1. This may seem strange from
looking at Figure 5.17 but keep in mind that the overall pathway of glycolysis results in net
synthesis of ATP from ADP. Rising ADP levels indicate a deficiency of ATP, so glycolysis
needs to be stimulated. Thus, ADP activates phosphofructokinase-1 in spite of the fact
that ADP is a product in this particular reaction.

Phosphoenolpyruvate and ADP affect the binding of the substrate fructose 6-phosphate
to phosphofructokinase-1. Kinetic experiments have shown that there are four
binding sites on phosphofructokinase-1 for fructose 6-phosphate and structural
experiments have confirmed that E. coli phosphofructokinase-1 (M_r 140,000) is a tetramer
consisting of four identical subunits. Figure 5.19 shows the structure of the enzyme
complexed with its products, fructose 1,6-bisphosphate and ADP, and a second molecule
of ADP, an allosteric activator. Two of the subunits shown in Figure 5.19a associate
to form a dimer. The two products are bound in the active site located between two
domains of each chain: ADP is bound to the large domain and fructose 1,6-bisphosphate
is bound mostly to the small domain. Two of these dimers interact to form the complete
tetrameric enzyme.

A notable feature of the structure of phosphofructokinase-1 (and a general feature
of regulated enzymes) is the physical separation of the active site and the regulatory
site on each subunit. (In some regulated enzymes the active sites and regulatory sites
are on different subunits.) The activator ADP binds at a distance from the active site in
a deep hole between the subunits. When ADP is bound to the regulatory site,
phosphofructokinase-1 assumes the R conformation, which has a high affinity for fructose
6-phosphate. When the smaller compound phosphoenolpyruvate is bound to the same
regulatory site the enzyme assumes a different conformation, the T conformation,
which has a lower affinity for fructose 6-phosphate. The transition between
conformations is accomplished by a slight rotation of one rigid dimer relative to the
other. The cooperativity of substrate binding is tied to the concerted movement of an
arginine residue in each of the four fructose 6-phosphate binding sites located near the
interface between the dimers. Movement of the side chain of this arginine from the active
site lowers the affinity for fructose 6-phosphate. In many organisms,
phosphofructokinase-1 is larger and is subject to more complex allosteric regulation than
in E. coli as you will see in Chapter 11.

Activators can affect either V_max or K_m or both. It’s important to recognize that the
binding of an activator alters the structure of an enzyme and this alteration converts it
to a different form that may have quite different kinetic properties. In most cases, the
differences between the kinetic properties of the R and T forms are more complex than
the differences we saw with enzyme inhibitors in Section 5.7.

B. General Properties of Allosteric Enzymes 

Examination of the kinetic and physical properties of allosteric enzymes has shown that 
they have the following general features: 

1. The activities of allosteric enzymes are changed by metabolic inhibitors and activa- 
tors. Often these allosteric effectors do not resemble the substrates or products of 
the enzyme. For example, phosphoenolpyruvate (Figure 5.18) resembles neither 
the substrate nor the product (Figure 5.17) of phosphofructokinase. Consideration 
of the structural differences between substrates and metabolic inhibitors originally 
led to the conclusion that allosteric effectors are bound to regulatory sites separate 
from catalytic sites. 

2. Allosteric effectors bind noncovalently to the enzymes they regulate. (There is a 
special group of regulated enzymes whose activities are controlled by covalent 
modification, described in Section 5.9D.) Many effectors alter the Km of the
enzyme for a substrate, but some alter the Vmax. Allosteric effectors themselves are not
altered chemically by the enzyme. 

3. With few exceptions, regulated enzymes are multisubunit proteins. (But not all 
multisubunit enzymes are regulated.) The individual polypeptide chains of a 
regulated enzyme may be identical or different. For those with identical subunits
(such as phosphofructokinase-1 from E. coli), each polypeptide chain can
contain both the catalytic and regulatory sites and the oligomer is a symmetric
complex, most often possessing two or four protein chains. Regulated enzymes
composed of nonidentical subunits have more complex, but usually symmetric,
arrangements.

4. An allosterically regulated enzyme usually has at least one substrate for which the 
v0 versus [S] curve is sigmoidal rather than hyperbolic (Section 5.9).
Phosphofructokinase-1 exhibits Michaelis-Menten (hyperbolic) kinetics with respect to
one substrate, ATP, but sigmoidal kinetics with respect to its other substrate,
fructose 6-phosphate. A sigmoidal curve is caused by positive cooperativity of
substrate binding and this is made possible by the presence of multiple substrate
binding sites in the enzyme (four binding sites in the case of tetrameric
phosphofructokinase-1).

The allosteric R ⇌ T transition between the active and the inactive conformations
of a regulatory enzyme is rapid. The ratio of R to T is controlled by the concentrations of
the various ligands and the relative affinities of each conformation for these ligands. In
the simplest cases, substrate and activator molecules bind only to enzyme in the R state
(E_R) and inhibitor molecules bind only to enzyme in the T state (E_T).


Allosteric effectors shift the concentrations
of the R and T forms of an allosteric enzyme.


▲ Figure 5.19 

The R conformation of phosphofructokinase-1
from E. coli. The enzyme is a tetramer of
identical chains. (a) Single subunit, shown
as a ribbon. The products,
fructose 1,6-bisphosphate (yellow) and ADP (green), are
bound in the active site. The allosteric
activator ADP (red) is bound in the regulatory
site. (b) Tetramer. Two subunits are blue, and two are
purple. The products,
fructose 1,6-bisphosphate (yellow) and ADP (green), are
bound in the four active sites. The allosteric
activator ADP (red) is bound in the four
regulatory sites, at the interface of the
subunits. [PDB 1PFK].

The relationship between the regula- 
tion of an individual enzyme and a 
pathway is discussed in Section 10.2B, 
where we encounter terms such as 
feedback inhibition and feedforward 

Figure 5.20 ► 

Role of cooperativity of binding in regulation. 

The activity of an allosteric enzyme with a 
sigmoidal binding curve can be altered 
markedly when either an activator or an in- 
hibitor is bound to the enzyme. Addition of 
an activator can lower the apparent K m rais- 
ing the activity at a given [S]. Conversely, 
addition of an inhibitor can raise the appar- 
ent K m producing less activity at a given [S]. 







These simplified examples illustrate the main property of allosteric effectors: they shift
the steady-state concentrations of free E_T and E_R.

Figure 5.20 illustrates the regulatory role that cooperative binding can play. Addition
of an activator can shift the sigmoidal curve toward a hyperbolic shape, lowering
the apparent Km (the concentration of substrate required for half-saturation) and raising
the activity at a given [S]. The addition of an inhibitor can raise the apparent Km of
the enzyme and lower its activity at any particular concentration of substrate.

The addition of S leads to an increase in the concentration of enzyme in the R con- 
formation. Conversely, the addition of inhibitor increases the proportion of the T 
species. Activator molecules bind preferentially to the R conformation leading to an 
increase in the R/T ratio. Note that this simplified scheme does not show that there are 
multiple interacting binding sites for both S and I. 

Some allosteric inhibitors are nonclassical competitive inhibitors (Figure 5.8). For 
example, Figure 5.20 describes an enzyme that has a higher apparent K m for its sub- 
strate in the presence of the allosteric inhibitor but an unaltered V max . Therefore, the 
allosteric modulator is a competitive inhibitor. 

Some regulatory enzymes exhibit noncompetitive inhibition patterns where bind- 
ing of a modulator at the regulatory site does not prevent substrate from binding but 
appears to distort the conformation of the active site sufficiently to decrease the activity 
of the enzyme. 

C. Two Theories of Allosteric Regulation 

Recall that most proteins are made up of two or more polypeptide chains (Section 4.8). 
Enzymes are typical proteins — most of them have multiple subunits. This complicates 
our understanding of regulation. There are two general models that explain the cooper- 
ative binding of ligands to multimeric proteins. Both models describe the cooperative 
transitions in simple quantitative terms. 

The concerted model, or symmetry model, was devised to explain the cooperative 
binding of identical ligands, such as substrates. It was first proposed in 1965 by 


▲ Figure 5.21 

Two models for cooperativity of binding of substrate (S) to a tetrameric protein. A two-subunit protein is shown for simplicity. In all cases, the enzymati- 
cally active subunit (R) is colored green and the inactive conformation (T) is colored red. (a) In the simplified concerted model, both subunits are ei- 
ther in the R conformation or the T conformation. Substrate (S) can bind to subunits in either conformation but binding to T is assumed to be weaker 
than binding to R. Cooperativity is explained by postulating that when substrate binds to a subunit in the T conformation (red), it shifts the protein 
into a conformation where both subunits are in the R conformation. (b) In the sequential model, one subunit may be in the R conformation while
another is in the T conformation. As in the concerted model, both conformations can bind substrate. Cooperativity is achieved by postulating that
substrate binding causes the subunit to shift to the R conformation and that when one subunit has adopted the R conformation, the other one is more
likely to bind substrate and undergo a conformation change (diagonal lines).

Jacques Monod, Jeffries Wyman, and Jean-Pierre Changeux and it’s sometimes known 
as the MWC model. The concerted model assumes there is one substrate binding site 
on each subunit. According to the concerted model, the conformation of each subunit 
is constrained by its association with other subunits and when the protein changes 
conformation it retains its molecular symmetry (Figure 5.21a). Thus, there are two 
conformations in equilibrium, R and T. When a subunit is in the R conformation it 
has a high affinity for the substrate. Subunits in the T conformation have a low affin- 
ity for the substrate. The binding of substrate to one subunit shifts the equilibrium 
since it “locks” the other subunits in the R conformation making it more likely that 
the other subunits will bind substrate. This explains the cooperativity of substrate binding.

When the conformation of the protein changes, the affinity of its substrate binding 
sites also changes. The concerted model was extended to include the binding of al- 
losteric effectors and it can be simplified by assuming that the substrate binds only to 
the R conformation and the allosteric effectors bind preferentially to one of the confor- 
mations — inhibitors bind only to subunits in the T conformation and activators bind 
only to subunits in the R conformation. The concerted model is based on the observed 
structural symmetry of regulatory enzymes. It suggests that all subunits of a given pro- 
tein molecule have the same conformation, either all R or all T. 

When the enzyme shifts from one conformation to the other, all subunits change 
conformation in a concerted manner. Experimental data obtained with a number of en- 
zymes can be explained by this simple theory. For example, many of the properties of 
phosphofructokinase-1 from E. coli fit the concerted theory. In most cases, however, the
concerted theory does not adequately account for all of the observations concerning a
particular enzyme. The behavior of most regulated enzymes is more complex than that suggested by this simple
all-or-nothing model.
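
The concerted model can be put into numbers. The sketch below uses the standard Monod-Wyman-Changeux saturation function, which is not written out in this section, so every symbol in it is an assumption introduced here for illustration: n is the number of substrate sites, l0 is the [T]/[R] ratio when no ligand is bound, c is the ratio of the substrate dissociation constants (K_R/K_T), and k_r is the dissociation constant for the R conformation. An activator is represented, very roughly, as a lower l0 and an inhibitor as a higher l0.

```python
def mwc_saturation(s, k_r=1.0, n=4, l0=1000.0, c=0.01):
    """Fraction of substrate sites occupied in the concerted (MWC) model.

    s   : substrate concentration (same units as k_r)
    k_r : dissociation constant of S from the R conformation (assumed value)
    n   : number of substrate binding sites (4 for a tetramer)
    l0  : [T]/[R] ratio with no ligand bound (assumed value)
    c   : K_R / K_T; c < 1 means T binds substrate more weakly than R
    """
    alpha = s / k_r
    bound_r = alpha * (1 + alpha) ** (n - 1)
    bound_t = l0 * c * alpha * (1 + c * alpha) ** (n - 1)
    total = (1 + alpha) ** n + l0 * (1 + c * alpha) ** n
    return (bound_r + bound_t) / total


def half_saturation(l0, k_r=1.0, n=4, c=0.01):
    """Substrate concentration at which half the sites are filled.

    Found by bisection; saturation rises steadily with [S] in this model.
    """
    low, high = 0.0, 1.0e6
    for _ in range(200):
        mid = (low + high) / 2
        if mwc_saturation(mid, k_r, n, l0, c) < 0.5:
            low = mid
        else:
            high = mid
    return (low + high) / 2


if __name__ == "__main__":
    # Hypothetical l0 values standing in for bound activator or inhibitor.
    cases = [("activator (lower l0)", 10.0),
             ("no effector", 1000.0),
             ("inhibitor (higher l0)", 100000.0)]
    for label, l0 in cases:
        print(f"{label:22s} half-saturating [S] = {half_saturation(l0):.2f}")
```

With l0 set to zero the expression collapses to a simple hyperbola, which is one way to see why an activator that favors the R conformation pushes a sigmoidal curve toward a hyperbolic shape.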

The sequential model was first proposed by Daniel Koshland, George Nemethy, and
David Filmer (KNF model). It is a more general model because it allows
subunits in two different conformations to coexist within the same multimeric protein.
The specific induced-fit version of the model is based on the idea that a ligand may induce
a change in the tertiary structure of each subunit to which it binds. This subunit-ligand
complex may change the conformations of neighboring subunits to varying extents. 
Like the concerted model, the sequential model assumes that only one shape has a high 
affinity for the ligand but it differs from the concerted model in allowing for the existence
of both high- and low-affinity subunits in a multisubunit protein (Figure 5.21b).

Hundreds of allosteric proteins have been studied and the majority show coopera- 
tive binding of substrates and/or effector molecules. It has proven to be very difficult to 
distinguish between the concerted and sequential models. Many proteins exhibit bind- 
ing behavior that can best be explained as a mixture of the all-or-nothing shift of the 
concerted model and the stepwise shift of the sequential model. 

D. Regulation by Covalent Modification 

▲ Figure 5.22 

Regulation of mammalian pyruvate dehydroge- 
nase. Pyruvate dehydrogenase, an intercon- 
vertible enzyme, is inactivated by 
phosphorylation catalyzed by pyruvate 
dehydrogenase kinase. It is reactivated by 
hydrolysis of its phosphoserine residue, 
catalyzed by an allosteric hydrolase called 
pyruvate dehydrogenase phosphatase. 

The activity of an enzyme can be modified by the covalent attachment and removal of 
groups on the polypeptide chain. Regulation by covalent modification is usually slower 
than the allosteric regulation described above. It’s important to note that the covalent 
modification of regulated enzymes must be reversible, otherwise it wouldn’t be a form 
of regulation. The modifications usually require additional modifying enzymes for acti- 
vation and inactivation. The activities of these modifying enzymes may themselves be 
allosterically regulated or regulated by covalent modification. Enzymes controlled by 
covalent modification are believed to generally undergo R ⇌ T transitions but they
may be frozen in one conformation or the other by a covalent substitution.

The most common type of covalent modification is phosphorylation of one or 
more specific serine residues, although in some cases threonine, tyrosine, or histidine 
residues are phosphorylated. An enzyme called a protein kinase catalyzes the transfer of 
the terminal phosphoryl group from ATP to the appropriate serine residue of the regu- 
lated enzyme. The phosphoserine of the regulated enzyme is hydrolyzed by the activity 
of a protein phosphatase, releasing phosphate and returning the enzyme to its dephos- 
phorylated state. Individual enzymes differ as to whether it is their phosphorylated or 
dephosphorylated forms that are active. 
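
The balance between a protein kinase and a protein phosphatase can be sketched with a very simple model. It assumes that each modifying enzyme acts on its target with a single first-order rate constant; the names k_kinase and k_phosphatase and their values are hypothetical and are not taken from the text. Under that assumption the phosphorylated fraction settles at k_kinase / (k_kinase + k_phosphatase).

```python
def steady_state_phosphorylated(k_kinase, k_phosphatase):
    """Fraction of enzyme phosphorylated once the two reactions balance."""
    return k_kinase / (k_kinase + k_phosphatase)


def simulate_phosphorylation(k_kinase, k_phosphatase, p0=0.0, dt=0.001, t_end=50.0):
    """Euler integration of dp/dt = k_kinase*(1 - p) - k_phosphatase*p.

    p is the phosphorylated fraction; p0 is its starting value.
    """
    p = p0
    for _ in range(int(t_end / dt)):
        p += dt * (k_kinase * (1.0 - p) - k_phosphatase * p)
    return p


if __name__ == "__main__":
    # Assumed rate constants (per unit time), for illustration only.
    k_kin, k_phos = 2.0, 1.0
    p = simulate_phosphorylation(k_kin, k_phos)
    print(f"phosphorylated fraction at the end of the run: {p:.4f}")
    print(f"predicted steady state: {steady_state_phosphorylated(k_kin, k_phos):.4f}")
    # For an enzyme that is inactive when phosphorylated,
    # the active fraction is 1 - p.
    print(f"active fraction if the phospho form is inactive: {1.0 - p:.4f}")
```

Raising the kinase rate constant or lowering the phosphatase rate constant shifts the balance toward the phosphorylated form. Whether that means more or less activity depends on which form of the particular enzyme is active.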

The reactions involved in the regulation of mammalian pyruvate dehydrogenase by 
covalent modification are shown in Figure 5.22. Pyruvate dehydrogenase catalyzes a re- 
action that connects the pathway of glycolysis to the citric acid cycle. Phosphorylation 
of pyruvate dehydrogenase, catalyzed by the allosteric enzyme pyruvate dehydrogenase 
kinase, inactivates the dehydrogenase. The kinase can be activated by any of several 
metabolites. Phosphorylated pyruvate dehydrogenase is reactivated under different 
metabolic conditions by hydrolysis of its phosphoserine residue, catalyzed by pyruvate 
dehydrogenase phosphatase. 

5.10 Multienzyme Complexes and 
Multifunctional Enzymes 

In some cases, different enzymes that catalyze sequential reactions in the same pathway 
are bound together in a multienzyme complex. In other cases, different activities may be 
found on a single multifunctional polypeptide chain. The presence of multiple activities 
on a single polypeptide chain is usually the result of a gene fusion event. 

Some multienzyme complexes are quite stable. We will encounter several of these 
complexes in other chapters. In other multienzyme complexes the proteins may be 
associated more weakly (Section 4.9). Because these complexes dissociate easily it has 
been difficult to demonstrate their existence and importance. Attachment to mem- 
branes or cytoskeletal components is another way that enzymes may be associated. 

The metabolic advantages of multienzyme complexes and multifunctional en- 
zymes include the possibility of metabolite channeling. Channeling of reactants between 
active sites can occur when the product of one reaction is transferred directly to the 
next active site without entering the bulk solvent. This can vastly increase the rate of a 
reaction by decreasing transit times for intermediates between enzymes and by produc- 
ing local high concentrations of intermediates. Channeling can also protect chemically 
labile intermediates from degradation by the solvent. Metabolic channeling is one way 
in which enzymes can effectively couple separate reactions. 

One of the best-characterized examples of channeling involves the enzyme tryptophan
synthase that catalyzes the last two steps in the biosynthesis of tryptophan
(Section 17.3F). Tryptophan synthase has a tunnel that conducts a reactant between its two
active sites. The structure of the enzyme not only prevents the loss of the reactant to the 
bulk solvent but also provides allosteric control to keep the reactions occurring at the 
two active sites in phase. 

Several other enzymes have two or three active sites connected by a molecular tun- 
nel. Another mechanism for metabolite channeling involves guiding the reactant along 
a path of basic amino acid side chains on the surface of coupled enzymes. The metabo- 
lites (most of which are negatively charged) are directed between active sites by the elec- 
trostatically positive surface path. The fatty acid synthase complex catalyzes a sequence 
of seven reactions required for the synthesis of fatty acids. The structure of this complex 
is described in Chapter 16 (Section 16.1). 

The search for enzyme complexes and the evaluation of their catalytic and regula- 
tory roles is an extremely active area of research. 

The regulation of pyruvate dehydroge- 
nase activity is explained in Section 
13.5. An example of a signal transduc- 
tion pathway involving covalent modifi- 
cation is described in Section 12.6. 


Summary

1. Enzymes, the catalysts of living organisms, are remarkable for
their catalytic efficiency and their substrate and reaction speci- 
ficity. With few exceptions, enzymes are proteins or proteins plus 
cofactors. Enzymes are grouped into six classes (oxidoreductases, 
transferases, hydrolases, lyases, isomerases, and ligases) according 
to the nature of the reactions they catalyze. 

2. The kinetics of a chemical reaction can be described by a rate equation.

3. Enzymes and substrates form noncovalent enzyme-substrate 
complexes. Consequently, enzymatic reactions are characteris- 
tically first order with respect to enzyme concentration and 
typically show hyperbolic dependence on substrate concentration.
The hyperbola is described by the Michaelis-Menten equation.

4. Maximum velocity (Vmax) is reached when the substrate concen- 
tration is saturating. The Michaelis constant (K m ) is equal to the 
substrate concentration at half-maximal reaction velocity — that 
is, at half-saturation of E with S.

5. The catalytic constant (kcat), or turnover number, for an enzyme
is the maximum number of molecules of substrate that can be
transformed into product per molecule of enzyme (or per active
site) per second. The ratio kcat/Km is an apparent second-order
rate constant that governs the reaction of an enzyme when the
substrate is dilute and nonsaturating. kcat/Km provides a measure
of the catalytic efficiency of an enzyme.

6. K m and V max can be obtained from plots of initial velocity at a series 
of substrate concentrations and at a fixed enzyme concentration. 

7. Multisubstrate reactions may follow a sequential mechanism with 
binding and release events being ordered or random, or a ping- 
pong mechanism. 

8. Inhibitors decrease the rates of enzyme-catalyzed reactions.
Reversible inhibitors may be competitive (increasing the apparent
value of Km without changing Vmax), uncompetitive (appearing
to decrease Km and Vmax proportionally), noncompetitive
(appearing to decrease Vmax without changing Km), or mixed.
Irreversible enzyme inhibitors form covalent bonds with the enzyme.

9. Allosteric modulators bind to enzymes at a site other than the ac- 
tive site and alter enzyme activity. Two models, the concerted 
model and the sequential model, describe the cooperativity of al- 
losteric enzymes. Covalent modification, usually phosphorylation, 
of certain regulatory enzymes can also regulate enzyme activity. 

10. Multienzyme complexes and multifunctional enzymes are very
common. They can channel metabolites between active sites. 
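
Points 4 through 6 of the summary can be illustrated numerically. The sketch below generates velocities from the Michaelis-Menten equation using made-up values of Vmax and Km (they are illustrative and do not come from the text), then recovers both constants from a double-reciprocal (Lineweaver-Burk) fit, 1/v0 = (Km/Vmax)(1/[S]) + 1/Vmax. The function names are assumptions chosen for this example.

```python
def michaelis_menten(s, vmax, km):
    """Initial velocity v0 at substrate concentration s."""
    return vmax * s / (km + s)


def lineweaver_burk_fit(s_values, v_values):
    """Estimate (Vmax, Km) by least squares on 1/v0 versus 1/[S].

    The fitted line has slope Km/Vmax and y-intercept 1/Vmax.
    """
    xs = [1.0 / s for s in s_values]
    ys = [1.0 / v for v in v_values]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    vmax = 1.0 / intercept
    km = slope * vmax
    return vmax, km


if __name__ == "__main__":
    # Illustrative constants, not experimental data.
    true_vmax, true_km = 100.0, 2.0
    substrate = [0.5, 1.0, 2.0, 5.0, 10.0, 20.0]
    velocities = [michaelis_menten(s, true_vmax, true_km) for s in substrate]
    vmax_est, km_est = lineweaver_burk_fit(substrate, velocities)
    print(f"estimated Vmax = {vmax_est:.3f}, estimated Km = {km_est:.3f}")
    # At [S] = Km the velocity is half of Vmax.
    print(f"v0 at [S] = Km: {michaelis_menten(true_km, true_vmax, true_km):.1f}")
```

Because the synthetic data contain no noise, the fit returns the starting constants. With real measurements the reciprocal transformation exaggerates errors at low substrate concentrations, so the estimates should be treated with some caution.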


Problems

1. Initial velocities have been measured for the reaction of α-chymotrypsin
with tyrosine benzyl ester [S] at six different substrate
concentrations. Use the data below to make a reasonable estimate
of the Vmax and Km value for this substrate.

[S] (mM)       0.00125   0.01   0.04   0.10   2.0   10
v0 (mM/min)    14        35     56     66     69    70

2. Why is the k cat /K m value used to measure the catalytic proficiency 
of an enzyme? 

(a) What are the upper limits for k cat /K m values for enzymes? 

(b) Enzymes with kcat/Km values approaching these upper limits
are said to have reached “catalytic perfection.” Explain. 

3. Carbonic anhydrase (CA) has a 25,000-fold higher activity (kcat =
10⁶ s⁻¹) than orotidine monophosphate decarboxylase (OMPD)
(kcat = 40 s⁻¹). However, OMPD provides more than a 10¹⁰ higher
“rate acceleration” than CA (Table 5.2). Explain how this is possible.

4. An enzyme that follows Michaelis-Menten kinetics has a Km of
1 μM. The initial velocity is 0.1 μM min⁻¹ at a substrate concentration
of 100 μM. What is the initial velocity when [S] is equal to
(a) 1 mM, (b) 1 μM, or (c) 2 μM?

5. Human immunodeficiency virus 1 (HIV-1) encodes a protease 
(Mr 21,500) that is essential for the assembly and maturation of
the virus. The protease catalyzes the hydrolysis of a heptapeptide
substrate with a kcat of 1000 s⁻¹ and a Km of 0.075 M.

(a) Calculate Vmax for substrate hydrolysis when HIV-1 protease
is present at 0.2 mg ml⁻¹.

(b) When -C(O)NH- of the heptapeptide is replaced by
-CH2NH-, the resulting derivative cannot be cleaved by
HIV-1 protease and acts as an inhibitor. Under the same experimental
conditions as in part (a), but in the presence of
2.5 n M inhibitor, Vmax is 9.3 × 10⁻³ M s⁻¹. What kind of inhibition
is occurring? Is this type of inhibition expected for a
molecule of this structure?

6. Draw a graph of v 0 versus [S] for a typical enzyme reaction (a) in 
the absence of an inhibitor, (b) in the presence of a competitive 
inhibitor, and (c) in the presence of a noncompetitive inhibitor. 

7. Sulfonamides (sulfa drugs) such as sulfanilamide are antibacterial 
drugs that inhibit the enzyme dihydropteroate synthase (DS) that 
is required for the synthesis of folic acid in bacteria. There is no 
corresponding enzyme inhibition in animals because folic acid is 
a required vitamin and cannot be synthesized. If p-aminobenzoic
acid (PABA) is a substrate for DS, what type of inhibition can be 
predicted for the bacterial synthase enzyme in the presence of sul- 
fonamides? Draw a double reciprocal plot for this type of inhibi- 
tion with correctly labeled axes and identify the uninhibited and 
inhibited lines. 

[Structures: sulfonamides (R = H, sulfanilamide) and p-aminobenzoic acid]

8. (a) Fumarase is an enzyme in the citric acid cycle that catalyzes 
the conversion of fumarate to L-malate. Given the fumarate 
(substrate) concentrations and initial velocities below, 
construct a Lineweaver-Burk plot and determine the Vmax
and Km values for the fumarase-catalyzed reaction.

Fumarate (mM)    Rate (mmol l⁻¹ min⁻¹)









(b) Fumarase has a molecular weight of 194,000 and is composed of 
four identical subunits, each with an active site. If the enzyme 
concentration is 1 X 10~ 2 M for the experiment in part (a), 
calculate the k cat value for the reaction of fumarase with 
fumarate. Note: The units for kcat are reciprocal seconds (s⁻¹).

9. Covalent enzyme regulation plays an important role in the 
metabolism of muscle glycogen, an energy storage molecule. The 
active phosphorylated form of glycogen phosphorylase (GP) catalyzes
the degradation of glycogen to glucose 1-phosphate. Using
pyruvate dehydrogenase as a model (Figure 5.22), fill in the boxes
below for the activation and inactivation of muscle glycogen phosphorylase.

10. Regulatory enzymes in metabolic pathways are often found at the
first step that is unique to that pathway. How does regulation at 
this point improve metabolic efficiency? 

11. ATCase is a regulatory enzyme at the beginning of the pathway 
for the biosynthesis of pyrimidine nucleotides. ATCase exhibits 
positive cooperativity and is activated in vitro by ATP and inhib- 
ited by the pyrimidine nucleotide cytidine triphosphate (CTP). 
Both ATP and CTP affect the K m for the substrate aspartate but 
not V max . In the absence of ATP or CTP, the concentration of as- 
partate required for half-maximal velocity is about 5 mM at satu- 
rating concentrations of the second substrate, carbamoyl phos- 
phate. Draw a v 0 versus [aspartate] plot for ATCase, and indicate 
how CTP and ATP affect v 0 when [aspartate] = 5 mM. 

12. The cytochrome P450 family of monooxygenase enzymes are in- 
volved in the clearance of foreign compounds (including drugs) 
from our body. P450s are found in many tissues, including the 
liver, intestine, nasal tissues, and lung. For every drug that is ap- 
proved for human use the pharmaceutical company must investi- 
gate the metabolism of the drug by cytochrome P450. Many of the 
adverse drug-drug interactions known to occur are a result of inter- 
actions with the cytochrome P450 enzymes. A significant portion of 
drugs are metabolized by one of the P450 enzymes, P450 3A4. 
Human intestinal P450 3A4 is known to metabolize midazolam, a 
sedative, to a hydroxylated product, 1'-hydroxymidazolam. The kinetic
data given below are for the reaction catalyzed by P450 3A4.

(a) Focusing on the first two columns, determine the Km and
Vmax for the enzyme using a Lineweaver-Burk plot.

(b) Ketoconazole, an antifungal, is known to cause adverse 
drug-drug interactions when administered with midazolam. 
Using the data in the table, determine the type of inhibition 
that ketoconazole exerts on the P450-catalyzed hydroxyla- 
tion of midazolam. 

Midazolam    Rate of product       Rate of product formation
(μM)         formation             in the presence of 0.1 pM
             (pmol l⁻¹ min⁻¹)      ketoconazole (pmol l⁻¹ min⁻¹)













[Adapted from Gibbs, M. A., Thummel, K. E., Shen, D. D., and
Kunze, K. L. Drug Metab. Dispos. (1999). 27:180-187]

13. Patients who are taking certain medications are warned by their 
physicians to avoid taking these medications with grapefruit 
juice, which contains many compounds including bergamottin. 
Cytochrome P450 3A4 is a monooxygenase that is known to me- 
tabolize drugs to their inactive forms. The following results were 
obtained when P450 3A4 activity was measured in the absence or 
presence of bergamottin. 

Bergamottin (μM)

(a) What is the effect of adding bergamottin to the P450-cat- 
alyzed reaction? 

(b) Why could it be dangerous for a patient to take certain 
medications with grapefruit juice? 

[Adapted from Wen, Y. H., Sahi, J., Urda, E., Kalkarni, S., 
Rose, K., Zheng, X., Sinclair, J. F., Cai, H., Strom, S. C., and 
Kostrubsky, V. E. Drug Metab. Dispos. (2002). 30:977-984.] 

14. Use the Michaelis-Menten equation (Equation 5.14) to
demonstrate the following:

(a) v0 becomes independent of [S] when [S] » Km.

(b) The reaction is first order with respect to S when [S] « Km.

(c) [S] = Km when v0 is one-half Vmax.

Selected Readings 

Enzyme Catalysis 

Fersht, A. (1985). Enzyme Structure and Mecha- 
nism , 2nd ed. (New York: W. H. Freeman). 

Lewis, C. A., and Wolfenden, R. (2008). Uropor- 
phyrinogen decarboxylation as a benchmark for 
the catalytic proficiency of enzymes. Proc. Natl 
Acad. Sci. (USA). 105:17328-17333. 

Miller, B. G., and Wolfenden, R. (2002). Catalytic 
proficiency: the unusual case of OMP decarboxy- 
lase. Annu. Rev. Biochem. 71, 847-885. 

Sigman, D. S., and Boyer, P. D., eds. (1990-1992). 
The Enzymes , Vols. 19 and 20, 3rd ed. (San Diego: 
Academic Press). 

Webb, E. C., ed. (1992). Enzyme Nomenclature 
1992: Recommendations of the Nomenclature Com- 
mittee of the International Union of Biochemistry 
and Molecular Biology on the Nomenclature and 
Classification of Enzymes (San Diego: Academic Press).

Enzyme Kinetics and Inhibition 

Bugg, C. E., Carson, W. M., and Montgomery, J. A. 
(1993). Drugs by design. Sci. Am. 269(6):92-98. 

Chandrasekhar, S. (2002). Thermodynamic analy- 
sis of enzyme catalysed reactions: new insights 
into the Michaelis-Menten equation. Res. Chem.
Intermed. 28:265-275.

Cleland, W. W. (1970). Steady State Kinetics. The 
Enzymes , Vol. 2, 3rd ed., P. D. Boyer, ed. (New York: 
Academic Press), pp. 1-65. 

Cornish-Bowden, A. (1999). Enzyme kinetics from
a metabolic perspective. Biochem. Soc. Trans. 

Northrop, D. B. (1998). On the meaning of Km
and V/K in enzyme kinetics. J. Chem. Ed.

Radzicka, A., and Wolfenden, R. (1995). A profi- 
cient enzyme. Science 267:90-93. 

Segel, I. H. (1975) Enzyme Kinetics: Behavior and 
Analysis of Rapid Equilibrium and Steady State 
Enzyme Systems (New York: Wiley-Interscience). 

Regulated Enzymes 

Ackers, G. K., Doyle, M. L., Myers, D., and Daugh- 
erty, M. A. (1992). Molecular code for cooperativ- 
ity in hemoglobin. Science 255:54-63. 

Barford, D. (1991). Molecular mechanisms for the 
control of enzymic activity by protein phosphory- 
lation. Biochim. Biophys. Acta 1133:55-62. 

Hilser, V. J. (2010). An ensemble view of allostery. 
Science 327:653-654. 

Hurley, J. H., Dean, A. M., Sohl, J. L., Koshland, D. 
E., Jr., and Stroud, R. M. (1990). Regulation of an 
enzyme by phosphorylation at the active site. 
Science 249:1012-1016. 

Schirmer, T., and Evans, P. R. (1990). Structural 
basis of the allosteric behavior of phosphofructok- 
inase. Nature 343:140-145. 

Metabolite Channeling 

Pan, P., Woehl, E., and Dunn, M. F. (1997). Protein 
architecture, dynamics and allostery in tryptophan 
synthase channeling. Trends Biochem. Sci. 


Velot, C., Mixon, M. B., Teige, M., and Srere, P. A. 
(1997). Model of a quinary structure between 
Krebs TCA cycle enzymes: a model for the 
metabolon. Biochemistry 36:14271-14276. 


Mechanisms of Enzymes 

The previous chapter described some general properties of enzymes with an
emphasis on enzyme kinetics. In this chapter, we see how enzymes catalyze reactions 
by studying the molecular details of catalyzed reactions. Individual enzyme 
mechanisms have been deduced by a variety of methods including kinetic experiments, 
protein structural studies, and studies of nonenzymatic model reactions. The results of 
such studies show that the extraordinary catalytic ability of enzymes results from simple 
physical and chemical properties, especially the binding and proper positioning of reac- 
tants in the active sites of enzymes. Chemistry, physics, and biochemistry have combined 
to take much of the mystery out of enzymes and recombinant DNA technology now 
allows us to test the theories proposed by enzyme chemists. Observations for which 
there were no explanations just a half-century ago are now thoroughly understood. 

The mechanisms of many enzymes are well established and they give us a general pic- 
ture of how enzymes function as catalysts. We begin this chapter with a review of simple 
chemical mechanisms, followed by a brief discussion of catalysis. We then examine the 
major modes of enzymatic catalysis: acid-base and covalent catalysis (classified as chemi- 
cal effects) and substrate binding and transition state stabilization (classified as binding 
effects). We end the chapter with some specific examples of enzyme mechanisms. 

I think that enzymes are molecules 
that are complementary in structure 
to the activated complexes of the 
reactions that they catalyze. 

—Linus Pauling (1948) 

6.1 The Terminology of Mechanistic Chemistry 

The mechanism of a reaction is a detailed description of the molecular, atomic, and 
even subatomic events that occur during the reaction. Reactants, products, and any in- 
termediates must be identified. A number of laboratory techniques are used to deter- 
mine the mechanism of a reaction. For example, the use of isotopically labeled reactants 
can trace the path of individual atoms and kinetic techniques can measure the changes in 
chemical bonds of a reactant or solvent during the reaction. Study of the stereochemical 
changes that occur during the reaction can give a three-dimensional view of the process. 
For any proposed enzyme mechanism, the mechanistic information about the reactants 
and intermediates must be coordinated with the three-dimensional structure of the en- 
zyme. This is an important part of understanding structure-function relationships — 
one of the main themes in biochemistry. 

Top: A step from the mechanism of the triose phosphate isomerase reaction. 



Enzymatic mechanisms are described using the same symbolism developed in or- 
ganic chemistry to represent the breaking and forming of chemical bonds. The move- 
ment of electrons is the key to understanding chemical (and enzymatic) reactions. We 
will review chemical mechanisms in this section and in the following sections we will 
discuss catalysis and present several specific enzyme mechanisms. This discussion 
should provide sufficient background for you to understand all the enzyme-catalyzed
reactions presented in this book. 

A. Nucleophilic Substitutions 

Many chemical reactions have ionic substrates, intermediates, or products. There are two
types of ionic molecules: one species is electron rich, or nucleophilic, and the other species 
is electron poor, or electrophilic (Section 2.6). A nucleophile has a negative charge or an 
unshared electron pair. We usually think of the nucleophile as attacking the electrophile 
and call the mechanism a nucleophilic attack or a nucleophilic substitution. In mechanistic 
chemistry, the movement of a pair of electrons is represented by a curved arrow pointing 
from the available electrons of the nucleophile to the electrophilic center. These “electron 
pushing” diagrams depict the breaking of an existing covalent bond or the formation of a 
new covalent bond. The reaction mechanism usually involves an intermediate. 

Many biochemical reactions are group transfer reactions where a group is moved 
from one molecule to another. Many of these reactions involve a charged intermediate. 
The transfer of an acyl group, for example, can be written as the general mechanism 


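$$\mathrm{R{-}C({=}O){-}X} + \mathrm{Y^{-}} \;\rightleftharpoons\; \underset{\text{tetrahedral intermediate}}{\mathrm{R{-}C(O^{-})(X){-}Y}} \;\rightleftharpoons\; \mathrm{R{-}C({=}O){-}Y} + \mathrm{X^{-}} \tag{6.1}$$

The nucleophile Y⁻ attacks the carbonyl carbon (i.e., adds to the carbonyl carbon atom) 
to form a tetrahedral addition intermediate from which X⁻ is eliminated. X⁻ is called 
the leaving group, the group displaced by the attacking nucleophile. This is an example 
of a nucleophilic substitution reaction. 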

Another type of nucleophilic substitution involves direct displacement. In this 
mechanism, the attacking group, or molecule, adds to the face of the central atom op- 
posite the leaving group to form a transition state having five groups associated with the 
central atom. This transition state is unstable. It has a structure between that of the re- 
actant and that of the product. (Transition states are shown in square brackets to 
identify them as unstable, transient entities.) 

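$$\mathrm{X^{-}} + \mathrm{R_1R_2R_3C{-}Y} \;\longrightarrow\; \underset{\text{Transition state}}{\left[\mathrm{X \cdots C(R_1R_2R_3) \cdots Y}\right]^{-}} \;\longrightarrow\; \mathrm{X{-}CR_1R_2R_3} + \mathrm{Y^{-}} \tag{6.2}$$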

Note that both types of nucleophilic substitution mechanisms involve a transitory 
state. In the first type (Reaction 6.1), the reaction proceeds in a stepwise manner form- 
ing an intermediate molecule that may be stable enough to be detected. In the second 
type of mechanism (Reaction 6.2), the addition of the attacking nucleophile and the 
displacement of the leaving group occur simultaneously. The transition state is not a 
stable intermediate. 

B. Cleavage Reactions 

We will also encounter cleavage reactions. Covalent bonds can be cleaved in two ways: ei- 
ther both electrons can stay with one atom or one electron can remain with each atom. 

Transition states are discussed further 
in Section 6.2. 


CHAPTER 6 Mechanisms of Enzymes 

The two electrons will stay with one atom in most reactions so that an ionic intermediate 
and a leaving group are formed. For example, cleavage of a C — H bond almost always 
produces two ions. If the carbon atom retains both electrons then the carbon- containing 
compound becomes a carbanion and the other product is a proton. 

$$\mathrm{R_3C{-}H} \;\longrightarrow\; \underset{\text{Carbanion}}{\mathrm{R_3C{:}^{-}}} + \underset{\text{Proton}}{\mathrm{H^{+}}} \tag{6.3}$$

If the carbon atom loses both electrons, the carbon-containing compound becomes a 
cation called a carbocation and the hydride ion carries a pair of electrons. 

$$\mathrm{R_3C{-}H} \;\longrightarrow\; \underset{\text{Carbocation}}{\mathrm{R_3C^{+}}} + \underset{\text{Hydride}}{\mathrm{H{:}^{-}}} \tag{6.4}$$

In the second, less common, type of bond cleavage, one electron remains with each 
product to form two free radicals that are usually very unstable. (A free radical, or radi- 
cal, is a molecule or atom with an unpaired electron.) 

$$\mathrm{R_1O{-}OR_2} \;\longrightarrow\; \mathrm{R_1O{\cdot}} + \mathrm{{\cdot}OR_2} \tag{6.5}$$

Loss of Electrons = Oxidation (LEO) 
Gain of Electrons = Reduction (GER) 

Remember the phrase: LEO (the lion) 
says GER 

Oxidation is Loss (OIL) 

Reduction is Gain (RIG) 

Remember the phrase: OIL RIG 

C. Oxidation-Reduction Reactions 

Oxidation-reduction reactions are central to the supply of biological energy. In an 
oxidation-reduction (redox) reaction, electrons from one molecule are transferred to 
another. The terminology here can be a bit confusing so it’s important to master the 
meaning of the words oxidation and reduction — they will come up repeatedly in the rest 
of the book. Oxidation is the loss of electrons: a substance that is oxidized will have fewer 
electrons when the reaction is complete. Reduction is the gain of electrons: a substance 
that gains electrons in a reaction is reduced. Oxidation and reduction reactions always 
occur together. One substrate is oxidized and the other is reduced. An oxidizing agent is 
a substance that causes an oxidation — it takes electrons from the substrate that is oxi- 
dized. Thus, oxidizing agents gain electrons (i.e., they are reduced). A reducing agent is 
a substance that donates electrons (and is oxidized in the process). 

Oxidations can take several forms, such as removal of hydrogen (dehydrogena- 
tion), addition of oxygen, or removal of electrons. Dehydrogenation is the most com- 
mon form of biological oxidation. Recall that oxidoreductases (enzymes that catalyze 
oxidation-reduction reactions) represent a large class of enzymes and dehydrogenases 
(enzymes that catalyze removal of hydrogen) are a major subclass of oxidoreductases 
(Section 5.1). 

Most dehydrogenations occur by C-H bond cleavage producing a hydride ion 
(H⁻). The substrate is oxidized because it loses the electrons associated with the 
hydride ion. Such reactions will be accompanied by a corresponding reduction where 
another substrate gains electrons by reacting with the hydride ion. The dehydrogenation 
of lactate (Equation 5.1) is an example of the removal of hydrogen. In this case, the 
oxidation of lactate is coupled to the reduction of the coenzyme NAD⁺. The role of 
cofactors in oxidation-reduction reactions will be discussed in the next chapter 
(Section 7.3) and the free energy of these reactions is described in Section 10.9. 

6.2 Catalysts Stabilize Transition States 

In order to understand catalysis it’s necessary to appreciate the importance of transition 
states and intermediates in chemical reactions. The rate of a chemical reaction depends 
on how often reacting molecules collide in such a way that a reaction is favored. The col- 
liding substances must be in the correct orientation and must possess sufficient energy to 
approach the physical configuration of the atoms and bonds of the final product. 

As mentioned above, the transition state is an unstable arrangement of atoms in 
which chemical bonds are in the process of being formed or broken. Transition states 
have extremely short lifetimes of about 10⁻¹⁴ to 10⁻¹³ second, the time of one bond 
vibration. Although they are very difficult to detect, their structures can be predicted. The 
energy required to reach the transition state from the ground state of the reactants is called 
the activation energy of the reaction and is often referred to as the activation barrier. 

◄ Figure 6.1 

Energy diagram for a single-step reaction. The upper arrow shows the activation energy for 
the forward reaction. Molecules of substrate that have more free energy than the activation 
energy pass over the activation barrier and become molecules of product. For reactions 
with a high activation barrier, energy in the form of heat must be provided in order 
for the reaction to proceed. [x axis: course of the reaction (reaction coordinate).] 

The progress of a reaction can be represented by an energy diagram, or energy pro- 
file. Figure 6.1 is an example that shows the conversion of a substrate (reactant) to a 
product in a single step. The y axis shows the free energies of the reacting species. The 
x axis, called the reaction coordinate, measures the progress of the reaction, beginning 
with the substrate on the left and proceeding to the product on the right. This axis is not 
time but rather the progress of bond breaking and bond formation of a particular mol- 
ecule. The transition state occurs at the peak of the activation barrier — this is the energy 
level that must be exceeded for the reaction to proceed. The lower the barrier the more 
stable the transition state and the more often the reaction proceeds. 

Intermediates, unlike transition states, can be sufficiently stable to be detected or iso- 
lated. When there is an intermediate in a reaction, the energy diagram has a trough that 
represents the free energy of the intermediate as shown in Figure 6.2. This reaction has two 
transition states, one preceding formation of the intermediate and one preceding its 
conversion to product. The slowest step, the rate-determining or rate-limiting step, is the step 
with the highest energy transition state. In Figure 6.2, the rate-determining step is the for- 
mation of the intermediate. The intermediate is metastable because relatively little energy is 
required for the intermediate either to continue to product or to revert to the original reac- 
tant. Proposed intermediates that are too short-lived to be isolated or detected are often en- 
closed in square brackets like transition states, which they presumably closely resemble. 
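
The rule stated above (the rate-determining step is the one with the highest energy transition state) can be sketched in a few lines of Python. The free energies below are invented for illustration and are not taken from Figure 6.2; the function names are likewise hypothetical.

```python
# Hypothetical free energies (kJ/mol, relative to the substrate S) for a
# two-step reaction with one intermediate. The numbers are invented.
profile = [
    ("S", 0.0),
    ("TS1", 80.0),
    ("I", 30.0),
    ("TS2", 60.0),
    ("P", -20.0),
]


def transition_states(points):
    """Return the local maxima of the profile (the transition states)."""
    return [
        points[i]
        for i in range(1, len(points) - 1)
        if points[i][1] > points[i - 1][1] and points[i][1] > points[i + 1][1]
    ]


def intermediates(points):
    """Return the local minima between the end points (the troughs)."""
    return [
        points[i]
        for i in range(1, len(points) - 1)
        if points[i][1] < points[i - 1][1] and points[i][1] < points[i + 1][1]
    ]


def rate_determining_transition_state(points):
    """Apply the rule from the text: pick the highest energy transition state."""
    return max(transition_states(points), key=lambda point: point[1])


label, energy = rate_determining_transition_state(profile)
print("Intermediates:", [name for name, _ in intermediates(profile)])
print("Rate-determining transition state:", label, energy, "kJ/mol")
```

With these made-up numbers the first barrier is the higher one, so formation of the intermediate is rate-determining, which matches the situation described for Figure 6.2.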

Catalysts create reaction pathways that have lower activation energies than those of 
uncatalyzed reactions. Catalysts participate directly in reactions by stabilizing the tran- 
sition states along the reaction pathways. Enzymes are catalysts that accelerate reactions 
by lowering the overall activation energy. They achieve rate enhancement by providing 
a multistep pathway (with one or several intermediates) in which each of the steps has 
lower activation energy than the corresponding stages in the nonenzymatic reaction. 
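
To get a feel for how strongly the rate depends on the height of the activation barrier, the sketch below uses the Eyring equation from transition state theory. That equation is not given in this section, so treat it as an assumption brought in for illustration; the barrier heights (100 and 66 kJ/mol) are hypothetical values, not data for any real enzyme.

```python
import math

R = 8.314462618      # gas constant, J mol^-1 K^-1
K_B = 1.380649e-23   # Boltzmann constant, J K^-1
H = 6.62607015e-34   # Planck constant, J s


def eyring_rate_constant(dg_activation, temperature=298.15):
    """First-order rate constant (s^-1) for an activation free energy in J/mol.

    Assumes the Eyring equation with a transmission coefficient of 1.
    """
    return (K_B * temperature / H) * math.exp(-dg_activation / (R * temperature))


def rate_enhancement(barrier_lowering, temperature=298.15):
    """Factor by which the rate rises when the barrier drops by barrier_lowering (J/mol)."""
    return math.exp(barrier_lowering / (R * temperature))


# Hypothetical barriers: 100 kJ/mol uncatalyzed, 66 kJ/mol catalyzed.
uncatalyzed = eyring_rate_constant(100e3)
catalyzed = eyring_rate_constant(66e3)
print(f"uncatalyzed k = {uncatalyzed:.3e} s^-1")
print(f"catalyzed   k = {catalyzed:.3e} s^-1")
print(f"enhancement   = {catalyzed / uncatalyzed:.3e}")
```

Because the dependence is exponential, each reduction of about 5.7 kJ/mol in the barrier corresponds to roughly a tenfold increase in rate at 25 °C under this model.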

The first step in an enzymatic reaction is the formation of a noncovalent 
enzyme-substrate complex, ES. In a reaction between A and B, formation of the EAB 
complex collects and positions the reactants making the probability of reaction much 
higher for the enzyme-catalyzed reaction than for the uncatalyzed reaction. Figures 6.3a 
and 6.3b show a hypothetical case in which substrate binding is the only mode of 
catalysis by an enzyme. In this example, the activation energy is lowered by bringing the 
reactants together in the substrate binding site. Correct substrate binding accounts for a 
large part of the catalytic power of enzymes. 

The active sites of enzymes bind substrates and products. They also bind transition 
states. In fact, transition states are likely to bind to active sites much more tightly than 
substrates do. The extra binding interactions stabilize the transition state, further lowering 
the activation energy (Figure 6.3c). We will see that the binding of substrates followed by 
the binding of transition states provides the greatest rate acceleration in enzyme catalysis. 

Transition states are unstable molecules 
with free energies higher than either the 
substrate or the product. 

The meaning of activation energy is 
described in Section 1.4D. 

▲ Figure 6.2 

Energy diagram for a reaction with an intermediate. The intermediate occurs in the trough 
between the two transition states. The rate-determining step in the forward direction is 
formation of the first transition state, the step with the higher energy transition state. 
S represents the substrate, and P represents the product. 

▲ Figure 6.3 

Enzymatic catalysis of the reaction A + B → A-B. (a) Energy diagram for an uncatalyzed 
reaction. (b) Effect of reactant binding. Collection of the two reactants in the EAB complex 
properly positions them for reaction, makes formation of the transition state more frequent, 
and hence lowers the activation energy. (c) Effect of transition-state stabilization. An enzyme 
binds the transition state more tightly than it binds substrates, further lowering the activation 
energy. Thus, an enzymatic reaction has a much lower activation energy than an uncatalyzed 
reaction. (The breaks in the reaction curves indicate that the enzymes provide multistep pathways.) 

We return to binding phenomena later in this chapter after we examine the chemical 
processes that underlie enzyme function. (Note that enzyme-catalyzed reactions are 
usually reversible. The same principles apply to the reverse reaction. The activation 
energy is lowered by binding the “products” and stabilizing the transition state.) 

In addition to reactive amino acid 
residues, there may be metal ions or 
coenzymes in the active site. The role 
of these cofactors in enzyme catalysis 
is described in Chapter 7. 

6.3 Chemical Modes of Enzymatic Catalysis 

The formation of an ES complex places reactants in proximity to reactive amino acid 
residues in the enzyme active site. Ionizable side chains participate in two kinds of 
chemical catalysis: acid-base catalysis and covalent catalysis. These are the two major 
chemical modes of catalysis. 

A. Polar Amino Acid Residues in Active Sites 

The active site cavity of an enzyme is generally lined with hydrophobic amino acid 
residues. However, a few polar, ionizable residues (and a few molecules of water) may 
also be present in the active site. Polar amino acid residues (or sometimes coenzymes) 
undergo chemical changes during enzymatic catalysis. These residues make up much of 
the catalytic center of the enzyme. 

Table 6.1 lists the ionizable residues found in the active sites of enzymes. Histidine, 
which has a pKa of about 6 to 7 in proteins, is often an acceptor or a donor of protons. 
Aspartate, glutamate, and occasionally lysine can also participate in proton transfer. 
Certain amino acids, such as serine and cysteine, are commonly involved in group- 
transfer reactions. At neutral pH, aspartate and glutamate usually have negative charges, 
and lysine and arginine have positive charges. These anions and cations can serve as 
sites for electrostatic binding of oppositely charged groups on substrates. 



It is possible to test the functions of the amino acid side 
chains of an enzyme using the technique of site-directed mu- 
tagenesis (see Section 23.10). This technique has had a huge 
impact on our understanding of structure-function relation- 
ships of enzymes. 

In site-directed mutagenesis, a desired mutation is engi- 
neered directly into a gene by synthesizing an oligonucleotide 
that contains the mutation flanked by sequences identical to 
the target gene. When this oligonucleotide is used as a primer 
for DNA replication in vitro, the new copy of the gene contains 
the desired mutation. Since alterations can be made at any 
position in a gene, specific changes in proteins can be engineered 
allowing direct testing of hypotheses about the functional 
role of key amino acid residues. Site-directed mutagenesis is 
commonly used to introduce single codon mutations into 
genes, resulting in single amino acid substitutions. 

The mutated gene can be introduced into bacterial cells 
where modified enzymes are synthesized from the gene. The 
structure and activity of the mutant protein can then be ana- 
lyzed to see the effect of changing an individual amino acid. 
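
The design of a mutagenic oligonucleotide described above (a changed codon flanked by sequence identical to the target gene) can be sketched as follows. The gene sequence, codon position, and flank length are hypothetical, and the function name is made up for this illustration. For simplicity the oligonucleotide is written in the same sense as the gene; a real primer would be chosen to be complementary to whichever strand serves as the template.

```python
def mutagenic_oligo(gene, codon_index, new_codon, flank=9):
    """Return (oligo, mutant_gene) for a single-codon substitution.

    gene        : coding sequence, 5' to 3', in frame from position 0
    codon_index : zero-based index of the codon to replace
    new_codon   : three-nucleotide replacement codon
    flank       : nucleotides of unchanged sequence kept on each side
    """
    if len(new_codon) != 3 or set(new_codon) - set("ACGT"):
        raise ValueError("new_codon must be three of A, C, G, T")
    start = codon_index * 3
    if start - flank < 0 or start + 3 + flank > len(gene):
        raise ValueError("not enough flanking sequence around the codon")
    mutant = gene[:start] + new_codon + gene[start + 3:]
    oligo = mutant[start - flank:start + 3 + flank]
    return oligo, mutant


# Hypothetical 10-codon gene; codon 4 (TGT, cysteine) is changed to TCT (serine).
gene = "ATGGCTAAACATTGTGGTTCAGATCTGTAA"
oligo, mutant = mutagenic_oligo(gene, codon_index=4, new_codon="TCT")
print("oligonucleotide:", oligo)
print("mutant gene:    ", mutant)
```

The flanking sequences are what allow the oligonucleotide to anneal to the target despite the mismatch at the altered codon.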






▲ Michael Smith (1932-2000), received 
the Nobel Prize in Chemistry in 1993 for 
inventing site-directed mutagenesis. 

◄ Oligonucleotide-directed, site-specific 
mutagenesis. A synthetic oligonucleotide 
containing the desired change (3 bp) is 
annealed to the single-stranded vector 
containing the sequence to be altered. The 
synthetic oligonucleotide serves as a primer 
for the synthesis of a complementary strand. 
The double-stranded, circular heteroduplex 
is transformed into E. coli cells where 
replication produces mutant and wild-type DNA 

[Figure labels: single-stranded vector containing sequence to be altered; transform cells; wild type.] 


Table 6.1 Catalytic functions of reactive groups of ionizable amino acids 

Amino acid | Reactive group | Net charge at pH 7 | Principal functions 
Aspartate  |                |                    | Cation binding; proton transfer 
Glutamate  |                |                    | Cation binding; proton transfer 
Histidine  |                | Near 0             | Proton transfer 
Cysteine   | -CH₂SH         | Near 0             | Covalent binding of acyl groups 
Tyrosine   |                |                    | Hydrogen bonding to ligands 
Lysine     |                | +1                 | Anion binding; proton transfer 
Arginine   |                | +1                 | Anion binding 
Serine     | -CH₂OH         |                    | Covalent binding of acyl groups 

Table 6.2 Typical pKa values of ionizable groups of amino acids in proteins 

Terminal α-carboxyl 
Side-chain carboxyl 
Terminal α-amino 












Table 6.3 Frequency distribution of catalytic residues in enzymes 

% of catalytic | % of all 































The pKa values of the ionizable groups of amino acid residues in proteins may differ 
from the values of the same groups in free amino acids (Section 3.4). Table 6.2 lists 
the typical pKa values of ionizable groups of amino acid residues in proteins. Compare 
these ranges to the exact values for free amino acids in Table 3.2. A given ionizable 
group can have different pKa values within a protein because of differing 
microenvironments. These differences are usually small but can be significant. 

Occasionally, the side chain of a catalytic amino acid residue exhibits a pKa quite 
different from the one shown in Table 6.2. Bearing in mind that pKa values may be 
perturbed, one can test whether particular amino acids participate in a reaction by 
examining the effect of pH on the reaction rate. If the change in rate correlates with the pKa 
of a certain ionizable amino acid (Section 6.3D), a residue of that amino acid may take 
part in catalysis. 

Only a small number of amino acid residues participate directly in catalyzing reac- 
tions. Most residues contribute in an indirect way by helping to maintain the correct 
three-dimensional structure of a protein. As we saw in Chapter 4, the majority of amino 
acid residues are not evolutionarily conserved. 

In vitro mutagenesis studies of enzymes have confirmed that most amino acid sub- 
stitutions have little effect on enzyme activity. Nevertheless, every enzyme has a few key 
residues that are absolutely essential for catalysis. Some of these residues are directly in- 
volved in the catalytic mechanism, often by acting as an acid or base catalyst or a nucle- 
ophile. Other residues act indirectly to assist or enhance the role of a key residue. Other 
roles for key catalytic residues include substrate binding, stabilization of the transition 
state, and interacting with essential cofactors. 

Enzymes usually have between two and six key catalytic residues. The top ten cat- 
alytic residues are listed in Table 6.3. The charged residues, His, Asp, Arg, Glu, and Lys 
account for almost two-thirds of all catalytic residues. This makes sense since charged 
side chains are more likely to act as acids, bases, and nucleophiles. They are also more 
likely to play a role in binding substrates or transition states. The number one catalytic 
residue is histidine. Histidine is 6 times more likely to be involved in catalysis than its 
abundance in proteins would suggest. 

B. Acid-Base Catalysis 

In acid-base catalysis, the acceleration of a reaction is achieved by catalytic transfer of a 
proton. Acid-base catalysis is the most common form of catalysis in organic chemistry 
and it’s also common in enzymatic reactions. Enzymes that employ acid-base catalysis 
rely on amino acid side chains that can donate and accept protons under the nearly neu- 
tral pH conditions of cells. This type of acid-base catalysis, involving proton-transferring 
agents, is termed general acid-base catalysis. (Catalysis by H⁺ or OH⁻ is termed specific 
acid or specific base catalysis.) In effect, the active sites of these enzymes provide the bio- 
logical equivalent of a solution of acid or base. 

It is convenient to use B: to represent a base, or proton acceptor, and BH⁺ to represent 
its conjugate acid, a proton donor. (This acid-base pair can also be written as 
HA/A⁻.) A proton acceptor can assist reactions in two ways: (1) it can cleave 
O-H, N-H, or even some C-H bonds by removing a proton 

$$\mathrm{{-}X{-}H} + \mathrm{{:}B} \;\rightleftharpoons\; \mathrm{{-}X{:}^{-}} + \mathrm{H{-}B^{+}} \tag{6.6}$$

and (2) the general base B: can participate in the cleavage of other bonds involving 
carbon, such as a C-N bond, by generating the equivalent of OH⁻ in neutral solution 
through removal of a proton from a molecule of water. 



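$$\mathrm{{-}C{-}N{<}} + \mathrm{H{-}O{-}H} + \mathrm{{:}B} \;\rightleftharpoons\; \mathrm{{-}C{-}OH} + \mathrm{H{-}N{<}} + \mathrm{{:}B} \tag{6.7}$$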

The general acid BH⁺ can also assist in bond cleavage. A covalent bond may break 
more easily if one of its atoms is protonated. For example, 

$$\mathrm{R^{+}} + \mathrm{OH^{-}} \;\rightleftharpoons\; \mathrm{R{-}OH}$$

$$\mathrm{R{-}OH} + \mathrm{H^{+}} \;\rightleftharpoons\; \mathrm{R{-}OH_2^{+}} \;\rightleftharpoons\; \mathrm{R^{+}} + \mathrm{H_2O} \tag{6.8}$$

BH⁺ catalyzes bond cleavage by donating a proton to an atom (such as the oxygen of 
R-OH in Equation 6.8), thereby making bonds to that atom more labile. In all reactions 
involving BH⁺ the reverse reaction is catalyzed by B:, and vice versa. 

Histidine is an ideal group for proton transfer at neutral pH values because the 
imidazole/imidazolium of the side chain has a pKa of about 6 to 7 in most proteins. We 
have seen that histidine is a common catalytic residue. In the following sections, we will 
examine some specific roles of histidine side chains. 
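
Why a pKa near 6 to 7 suits proton transfer at neutral pH can be seen from the fraction of side chains that are protonated. The sketch below uses the standard Henderson-Hasselbalch relationship, which is not derived in this section, and treats the side chain as a single independent ionizable group; both are simplifying assumptions.

```python
def fraction_protonated(ph, pka):
    """Fraction of an ionizable group in its protonated (acid) form.

    From the Henderson-Hasselbalch relationship:
    [base]/[acid] = 10 ** (pH - pKa).
    """
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


# pKa values spanning the range quoted in the text for histidine in proteins.
for pka in (6.0, 6.5, 7.0):
    acid = fraction_protonated(7.0, pka)
    print(f"pKa {pka}: {acid:.2f} imidazolium, {1 - acid:.2f} imidazole at pH 7")
```

Across this range both the proton-donating and the proton-accepting forms are present in substantial amounts at pH 7, whereas a group with a pKa several units away from 7 would exist almost entirely in one form.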


In acid-base catalysis, the reaction 
requires specific amino acid side chains 
that can donate and accept protons. 

C. Covalent Catalysis 

In covalent catalysis, a substrate is bound covalently to the enzyme to form a reactive in- 
termediate. The reacting side chain of the enzyme can be either a nucleophile or an 
electrophile. Nucleophilic catalysis is more common. In the second step of the reaction, 
a portion of the substrate is transferred from the intermediate to a second substrate. For 
example, the group X can be transferred from molecule A — X to molecule B in the fol- 
lowing two steps via the covalent ES complex X — E: 

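$$\mathrm{A{-}X} + \mathrm{E} \;\rightleftharpoons\; \mathrm{X{-}E} + \mathrm{A} \tag{6.9}$$

$$\mathrm{X{-}E} + \mathrm{B} \;\rightleftharpoons\; \mathrm{B{-}X} + \mathrm{E} \tag{6.10}$$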

This is a common mechanism for coupling two different reactions in biochemistry. 
Recall that the ability to couple reactions is one of the important properties of enzymes 
(Chapter 5; “Introduction”). Transferases, one of the six classes of enzymes (Section 5.1), 
catalyze group-transfer reactions in this manner and hydrolases catalyze a special kind 
of group-transfer reaction where water is the acceptor. Transferases and hydrolases 
together make up more than half of known enzymes. 

The reaction catalyzed by bacterial sucrose phosphorylase is an example of group 
transfer by covalent catalysis. (Sucrose is composed of one glucose residue and one 
fructose residue.) 

$$\text{Sucrose} + \mathrm{P_i} \;\rightleftharpoons\; \text{Glucose 1-phosphate} + \text{Fructose} \tag{6.11}$$


Figure 6.4 ► 

Covalent catalysis. The enzyme N-acetyl-D-neuraminic acid lyase from Escherichia 
coli catalyzes the condensation of pyruvate and N-acetyl-D-mannosamine to form 
N-acetyl-D-neuraminic acid (see Section 8.7C). One of the intermediates in the 
reaction is a Schiff base (see Fig. 5.15) between pyruvate (black carbon atoms) and a 
lysine residue. The intermediate is stabilized by hydrogen bonds with other amino acid 
side chains. [PDB 2WKJ] 


In covalent catalysis mechanisms, the 
enzyme participates directly in the 
reaction. It reacts with a substrate and 
an intermediate containing the enzyme is 
produced. The reaction is not complete 
until free enzyme is regenerated. 


The first chemical step in the reaction is formation of a covalent glucosyl-enzyme inter- 
mediate. In this case, sucrose is equivalent to A — X and glucose is equivalent to X in 
Reaction 6.9. 

$$\text{Sucrose} + \text{Enzyme} \;\rightleftharpoons\; \text{Glucosyl-Enzyme} + \text{Fructose} \tag{6.12}$$

The covalent ES intermediate can donate the glucose unit either to another mole- 
cule of fructose, in the reverse of Reaction 6.12, or to phosphate (which is equivalent to 
B in Reaction 6.10). 

$$\text{Glucosyl-Enzyme} + \mathrm{P_i} \;\rightleftharpoons\; \text{Glucose 1-phosphate} + \text{Enzyme} \tag{6.13}$$
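
A simple bookkeeping check shows that the two covalent-catalysis steps (Reactions 6.12 and 6.13) add up to the overall sucrose phosphorylase reaction (Reaction 6.11), with the enzyme and the glucosyl-enzyme intermediate cancelling. This is only a sketch of the accounting, not a model of the chemistry; the function name and data layout are made up for this illustration.

```python
from collections import Counter


def net_reaction(steps):
    """Sum a list of (reactants, products) steps and cancel shared species.

    Each species counts once per appearance, so stoichiometric
    coefficients are written by repeating a name.
    """
    balance = Counter()
    for reactants, products in steps:
        for species in reactants:
            balance[species] -= 1
        for species in products:
            balance[species] += 1
    net_reactants = [s for s, n in sorted(balance.items()) for _ in range(-n)]
    net_products = [s for s, n in sorted(balance.items()) for _ in range(n)]
    return net_reactants, net_products


steps = [
    (["Sucrose", "Enzyme"], ["Glucosyl-Enzyme", "Fructose"]),          # Reaction 6.12
    (["Glucosyl-Enzyme", "Pi"], ["Glucose 1-phosphate", "Enzyme"]),    # Reaction 6.13
]
reactants, products = net_reaction(steps)
print(" + ".join(reactants), "->", " + ".join(products))
```

The free enzyme appears on both sides and drops out, which reflects the point that the reaction is not complete until free enzyme is regenerated.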

Proof that an enzyme mechanism relies on covalent catalysis often requires the iso- 
lation or detection of an intermediate and demonstration that it is sufficiently reactive. 
In some cases, the covalently bound intermediate is seen in the crystal structure of an 
enzyme, and this is direct proof of covalent catalysis (Figure 6.4). 

D. pH Affects Enzymatic Rates 

The effect of pH on the reaction rate of an enzyme can suggest which ionizable amino 
acid residues are in its active site. Sensitivity to pH usually reflects an alteration in the 
ionization state of one or more residues involved in catalysis, although occasionally substrate 
binding is affected. A plot of reaction velocity versus pH most often yields a bell-shaped 
curve provided the enzyme is not denatured when the pH is altered. 

A good example is the pH versus rate profile for papain, a protease isolated from 
papaya fruit (Figure 6.5). The bell-shaped pH profile can be explained by assuming that 
the ascending portion of the curve represents the deprotonation of an active-site amino 
acid residue (B) and the descending portion represents the deprotonation of a second 
active-site amino acid residue (A). The two inflection points approximate the pKa values of 
the two ionizable residues. A simple bell-shaped curve is the result of two overlapping 
titrations. The side chain of A (R_A) must be protonated for activity and the side chain of 
B (R_B) must be unprotonated. 

◄ Figure 6.5 

pH vs rate profile for papain. The left and right segments of the bell-shaped curve represent 
the titrations of the side chains of active-site amino acids. The inflection point at pH 4.2 
reflects the pKa of Cys-25, and the inflection point at pH 8.2 reflects the pKa of His-159. 
The enzyme is active only when these ionic groups are present as the thiolate-imidazolium 
ion pair. 

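$$\underset{\text{inactive}}{\mathrm{HR_A,\ HR_B}} \;\rightleftharpoons\; \underset{\text{active}}{\mathrm{HR_A,\ R_B}} + \mathrm{H^{+}} \;\rightleftharpoons\; \underset{\text{inactive}}{\mathrm{R_A,\ R_B}} + 2\,\mathrm{H^{+}}$$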

At the pH optimum, midway between the two pKa values, the greatest number of 
enzyme molecules is in the active form, with residue A protonated and residue B unprotonated. Not all pH profiles 
are bell-shaped. A pH profile is a sigmoidal curve if only one ionizable amino acid 
residue participates in catalysis and it can have a more complicated shape if more than 
two ionizable groups participate. Enzymes are routinely assayed near their optimal pH, 
which is maintained using appropriate buffers. 
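
The bell-shaped profile can be reproduced with a minimal model in which the two groups titrate independently: the active fraction is the fraction of B that is unprotonated multiplied by the fraction of A that is protonated. The independence assumption is ours, and the pKa inputs are simply the inflection points quoted for papain (4.2 and 8.2).

```python
def fraction_active(ph, pka_b, pka_a):
    """Fraction of enzyme with B unprotonated and A protonated.

    Assumes the two groups ionize independently of each other.
    """
    b_unprotonated = 1.0 / (1.0 + 10.0 ** (pka_b - ph))
    a_protonated = 1.0 / (1.0 + 10.0 ** (ph - pka_a))
    return b_unprotonated * a_protonated


PKA_B, PKA_A = 4.2, 8.2  # inflection points of the papain profile in the text

ph_grid = [i / 10 for i in range(0, 141)]          # pH 0.0 to 14.0
optimum = max(ph_grid, key=lambda ph: fraction_active(ph, PKA_B, PKA_A))
print("pH optimum on this grid:", optimum)
for ph in (3.0, 4.2, 6.2, 8.2, 10.0):
    print(f"pH {ph:4.1f}: active fraction {fraction_active(ph, PKA_B, PKA_A):.3f}")
```

In this model the curve is symmetric about the midpoint of the two pKa values, and the activity is close to half-maximal at each pKa, which is why the inflection points approximate the pKa values.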

The pH versus rate graph for papain has inflection points at pH 4.2 and pH 8.2, 
suggesting that the activity of papain depends on two active-site amino acid residues 
with pKa values of about 4 and 8. These ionizable residues are a nucleophilic cysteine 
(Cys-25) and a proton-donating imidazolium group of histidine (His-159) (Figure 6.6). 
The side chain of cysteine normally has a pKa value of 8 to 9.5 but in the active site of 
papain the pKa of Cys-25 is greatly perturbed to 3.4. The pKa of the His-159 residue is 
perturbed to 8.3. The inflection points on the pH profile do not correspond exactly to the 
pKa values of Cys-25 and His-159 because the ionization of additional groups contributes 
slightly to the overall shape of the curve. Three ionic forms of the catalytic center of papain 
are shown in Figure 6.7. The enzyme is active only when the thiolate group and the 
imidazolium group form an ion pair (as in the upper tautomer of the middle pair). 

▲ Figure 6.6 Ionizable residues in papain. 

Model of papain, showing ball-and-stick 
models of the active-site histidine and 
cysteine side chains. The imidazole nitrogen 
atoms are blue, and the sulfur atom is 

6.4 Diffusion-Controlled Reactions 

A few enzymes catalyze reactions at rates approaching the upper physical limit of 
reactions in solution. This theoretical upper limit is the rate of diffusion of reactants into 
the active site. A reaction that occurs with every collision between reactant molecules is 
termed a diffusion-controlled reaction or a diffusion-limited reaction. Under physiological 
conditions the diffusion-controlled rate is about 10⁸ to 10⁹ M⁻¹ s⁻¹. Compare this 
theoretical maximum to the apparent second-order rate constants (kcat/Km) for five very fast 
enzymes listed in Table 6.4. 

The binding of a substrate to an enzyme is a rapid reaction. If the rest of the 
reaction is simple and fast, the binding step may be the rate-determining step and the 
overall rate of the reaction may approach the upper limit for catalysis. Only a few types of 
chemical reactions can proceed this quickly. These include association reactions, some 
proton transfers, and electron transfers. The reactions catalyzed by all the enzymes 
listed in Table 6.4 are so simple that the rate-determining steps are roughly as fast as 
binding of substrates to the enzymes. They catalyze diffusion-controlled reactions. We will 
now look at two of these enzymes in detail: triose phosphate isomerase and superoxide 
dismutase. 

Table 6.4 Enzymes with second-order rate constants near the upper limit 

Enzyme                     | Substrate                    | kcat/Km (M⁻¹ s⁻¹)* 
                           | H₂O₂                         | 4 × 10⁷ 
                           |                              | 2 × 10⁸ 
Triose phosphate isomerase | D-Glyceraldehyde 3-phosphate | 4 × 10⁸ 
                           |                              | 10⁹ 
Superoxide dismutase       |                              | 2 × 10⁹ 

*The ratio kcat/Km is the apparent second-order rate constant for the enzyme-catalyzed reaction E + S → E + P. 
For these enzymes, the formation of the ES complex can be the slowest step. 
A. Triose Phosphate Isomerase 

Triose phosphate isomerase catalyzes the rapid interconversion of dihydroxyacetone 
phosphate (DHAP) and glyceraldehyde 3-phosphate (G3P) in the glycolysis and 
gluconeogenesis pathways (Chapters 11 and 12). 




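$$\underset{\text{Dihydroxyacetone phosphate (DHAP)}}{\mathrm{HOCH_2{-}C({=}O){-}CH_2OPO_3^{2-}}} \;\rightleftharpoons\; \underset{\text{D-Glyceraldehyde 3-phosphate (G3P)}}{\mathrm{OHC{-}CH(OH){-}CH_2OPO_3^{2-}}}$$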








The reaction proceeds by shifting protons from carbon atom 1 of DHAP to 
carbon atom 2 (Figure 6.8). Triose phosphate isomerase has two ionizable active-site 
residues: glutamate, which acts as a general acid-base catalyst, and histidine, which shuttles a 
proton between oxygen atoms of an enzyme-bound intermediate. When dihydroxyacetone 
phosphate (DHAP) binds, the carbonyl oxygen forms a hydrogen bond with the 
imidazole group of His-95. The carboxylate group of Glu-165 removes a proton from 
C-1 of the substrate to form an enediolate transition state (Figure 6.8, top). The 
transition-state molecule is rapidly converted to a stable enediol intermediate (Figure 6.8, 
middle). This intermediate is then converted via a second enediolate transition state 
to D-glyceraldehyde 3-phosphate (G3P). 

In this reaction, the proton-donating form of histidine appears to be the neutral 
species and the proton-accepting species appears to be the imidazolate. The hydrogen 
bonds formed between histidine and the intermediates in this mechanism appear to be 
unusually strong. 

▲ Figure 6.7 The activity of papain depends 
on two ionizable residues, histidine (His-159) 
and cysteine (Cys-25), in the active site. Three 
ionic forms of these residues are shown. 

Only the upper tautomer of the middle pair 
is active. 



[Structure: imidazolate form of a histidine residue] 

The imidazolate form of a histidine residue is unusual; the triose phosphate isomerase 
mechanism was the first enzymatic mechanism in which this form was implicated. 

The enediol intermediate is stable and in order to prevent it from diffusing out of 
the active site, triose phosphate isomerase has evolved a “locking” mechanism to seal the 
active site until the reaction is complete. When substrate binds, a flexible loop of the 
protein moves to cover the active site and prevent release of the enediol intermediate 
(Figure 6.9). 

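The rate constants of all four kinetically measurable enzymatic steps have been 
determined. 

$$\mathrm{E} + \mathrm{DHAP} \;\overset{(1)}{\rightleftharpoons}\; \text{E-DHAP} \;\overset{(2)}{\rightleftharpoons}\; \text{E-Intermediate} \;\overset{(3)}{\rightleftharpoons}\; \text{E-G3P} \;\overset{(4)}{\rightleftharpoons}\; \mathrm{E} + \mathrm{G3P}$$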







▲ Figure 6.8 

General acid-base catalysis mechanism proposed for the 
reaction catalyzed by triose phosphate isomerase. 

▲ Figure 6.9 

Structure of yeast (Saccharomyces cerevisiae) triose phosphate isomerase. The location of the substrate is indicated by the space-filling model of a 
substrate analog. (a) The structure of the “open loop” form of the enzyme when the active site is unoccupied. (b) The structure when the loop has closed 
over the active site to prevent release of the enediol intermediate before the reaction is completed. 

174 CHAPTER 6 Mechanisms of Enzymes 

Figure 6.10 ► 

Energy diagram for the reaction catalyzed by 
triose phosphate isomerase. [Adapted from 
Raines, R. T., Sutton, E. L., Strauss, D. R., 
Gilbert, W., and Knowles, J. R. (1986). 
Reaction energetics of a mutant triose 
phosphate isomerase in which the active- 
site glutamate has been changed to 
aspartate. Biochem. 25:7142-7154.] 

The energy diagram constructed from these rate constants is shown in Figure 6.10. 
Note that all the barriers for the enzyme are approximately the same height. This means 
that the steps are balanced, and no single step is rate-limiting. The physical step of S 
binding to E is rapid but not much faster than the subsequent chemical steps in the 
reaction sequence. The value of the second-order rate constant $k_{\text{cat}}/K_m$ for the conversion 
of glyceraldehyde 3-phosphate to dihydroxyacetone phosphate is $4 \times 10^{8}\ \mathrm{M^{-1}\,s^{-1}}$, 
which is close to the theoretical rate of a diffusion-controlled reaction. It appears that 
this isomerase has achieved its maximum possible efficiency as a catalyst. 
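
A rough way to see what "close to the diffusion limit" means is to compare a measured value of kcat/Km with the range usually quoted for diffusion-controlled encounters. The sketch below is only illustrative: the window of 10^8 to 10^9 M^-1 s^-1 is an assumed rule of thumb, and the function name `classify_kcat_over_km` is hypothetical rather than something taken from the text.

```python
# Assumed rule-of-thumb window for diffusion-controlled encounters (M^-1 s^-1).
DIFFUSION_LIMIT_LOW = 1e8
DIFFUSION_LIMIT_HIGH = 1e9


def classify_kcat_over_km(kcat_over_km: float) -> str:
    """Place a kcat/Km value relative to the assumed diffusion-controlled window."""
    if kcat_over_km <= 0:
        raise ValueError("kcat/Km must be positive")
    if kcat_over_km < DIFFUSION_LIMIT_LOW:
        return "below the diffusion-controlled range"
    if kcat_over_km <= DIFFUSION_LIMIT_HIGH:
        return "within the diffusion-controlled range"
    return "above the diffusion-controlled range"


# Value quoted in the text for triose phosphate isomerase.
tpi_kcat_over_km = 4e8
print(classify_kcat_over_km(tpi_kcat_over_km))
```

Under this assumed window, the triose phosphate isomerase value falls inside the diffusion-controlled range, which matches the statement that the enzyme works about as fast as substrate can reach it.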


Much of our understanding of the mechanism of triose 
phosphate isomerase (TPI) comes from the lab of Jeremy 
Knowles at Harvard University (Cambridge, MA, USA). He 
points out that the enzyme has achieved catalytic perfection 
because the overall rate of the reaction is limited only by the 
rate of diffusion of substrate into the active site. TPI can’t 
work any faster than this! 

This has led many people to declare that TPI is the 
“perfect enzyme” because it has evolved to be so efficient. 
However, as Knowles and his coworkers have explained, the 
“perfect enzyme” isn’t necessarily one that has evolved the 
maximum reaction rate. Most enzymes are not under selec- 
tive pressure to increase their rate of reaction because they 
are part of a metabolic pathway that meets the cell’s needs at 
less than optimal rates. 

Even if it would be beneficial to increase the overall flux in 
a pathway (i.e., produce more of the end product per second), 
an individual enzyme need only keep up with the slowest 
enzyme in the pathway in order to achieve “perfection.” The 
slowest enzyme might be catalyzing a very complicated reac- 
tion and might be very efficient. In this case, there will be no 
selective pressure on the other enzymes to evolve faster 
mechanisms and they are all “perfect enzymes.” 

In all species, triose phosphate isomerase is part of the 
gluconeogenesis pathway leading to the synthesis of glucose. 
In most species, it also plays a role in the reverse pathway 
where glucose is degraded (glycolysis). The enzyme is very 
ancient, and all versions — bacterial and eukaryotic — have 
achieved catalytic perfection. The two enzymes on either side of 
the reaction pathway, aldolase and glyceraldehyde 3-phosphate 
dehydrogenase (Section 11.2), are much slower. Thus, it is by 
no means obvious why TPI works as fast as it does. 

The important point to keep in mind is that the vast majority 
of enzymes have not evolved catalytic perfection because their 
in vivo rates are “perfectly” adequate for the needs of the cell. 

▲ The Perfect Game. New York Yankees 
catcher Yogi Berra congratulates Don Larsen 
for pitching a perfect game in the 1956 
World Series against the Brooklyn Dodgers. 
Perfect games are rare in baseball but there 
are many “perfect enzymes.” 


B. Superoxide Dismutase 

Superoxide dismutase is an even faster catalyst than triose phosphate isomerase. 
Superoxide dismutase catalyzes the very rapid removal of the toxic superoxide radical anion, 
${\cdot}\mathrm{O_2^{-}}$, a by-product of oxidative metabolism. The enzyme catalyzes the conversion of 
superoxide to molecular oxygen and hydrogen peroxide, which is rapidly removed by 
the subsequent action of enzymes such as catalase. 

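$$4\,{\cdot}\mathrm{O_2^{-}} + 4\,\mathrm{H^{+}} \xrightarrow{\text{superoxide dismutase}} 2\,\mathrm{H_2O_2} + 2\,\mathrm{O_2}$$

$$2\,\mathrm{H_2O_2} \xrightarrow{\text{catalase}} 2\,\mathrm{H_2O} + \mathrm{O_2}$$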


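The reaction proceeds in two steps during which an atom of copper bound to the 
enzyme is reduced and then oxidized. 

$$\mathrm{E{-}Cu^{2+}} + {\cdot}\mathrm{O_2^{-}} \longrightarrow \mathrm{E{-}Cu^{+}} + \mathrm{O_2}$$

$$\mathrm{E{-}Cu^{+}} + {\cdot}\mathrm{O_2^{-}} + 2\,\mathrm{H^{+}} \longrightarrow \mathrm{E{-}Cu^{2+}} + \mathrm{H_2O_2} \qquad (6.20)$$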
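
Adding the two steps of Equation 6.20 shows that the copper ion is regenerated and therefore drops out of the net reaction, as expected for a catalyst. The sum below is simple bookkeeping written out as a check; it is not an additional equation from the original text.

```latex
\begin{aligned}
\mathrm{E{-}Cu^{2+}} + {\cdot}\mathrm{O_2^{-}} &\longrightarrow \mathrm{E{-}Cu^{+}} + \mathrm{O_2} \\
\mathrm{E{-}Cu^{+}} + {\cdot}\mathrm{O_2^{-}} + 2\,\mathrm{H^{+}} &\longrightarrow \mathrm{E{-}Cu^{2+}} + \mathrm{H_2O_2} \\
\text{Net:}\qquad 2\,{\cdot}\mathrm{O_2^{-}} + 2\,\mathrm{H^{+}} &\longrightarrow \mathrm{O_2} + \mathrm{H_2O_2}
\end{aligned}
```

Two superoxide anions and two protons are consumed for each molecule of oxygen and each molecule of hydrogen peroxide produced.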

▲ Figure 6.11 

Surface charge on human superoxide dismu- 
tase. The structure of the enzyme is shown 
as a model that emphasizes the surface of 
the protein. Positively charged regions are 
colored blue and negatively charged regions 
are colored red. The copper atom at the 
active site is green. Note that the channel 
leading to the binding site is lined with 
positively charged residues. [PDB 1HL5] 

The overall reaction includes binding of the anionic substrate 
molecules, transfer of electrons and protons, and release of the 
uncharged products, all very rapid reactions with this 
enzyme. The $k_{\text{cat}}/K_m$ value for superoxide dismutase at 25°C is 
near $2 \times 10^{9}\ \mathrm{M^{-1}\,s^{-1}}$ (Table 6.4). This rate is even faster than 
that expected for association of the substrate with the enzyme 
based on typical diffusion rates. 

How can the rate exceed the rate of diffusion? The expla- 
nation was revealed when the structure of the enzyme was ex- 
amined. An electric field around the superoxide dismutase 
active site enhances the rate of formation of the ES complex 
about 30-fold. As shown in Figure 6.11, the active-site copper 
atom lies at the bottom of a deep channel in the protein. 
Hydrophilic amino acid residues at the rim of the active-site 
pocket guide negatively charged ${\cdot}\mathrm{O_2^{-}}$ to the positively 
charged region surrounding the active site. Electrostatic 
effects allow superoxide dismutase to bind and remove 
superoxide (radicals) much faster than expected from random 
collisions of enzyme and substrate. 

There are probably many enzymes with enhanced rates of 
binding due to electrostatic effects. In most cases, the 
rate-limiting step is catalysis so the overall rate ($k_{\text{cat}}/K_m$) is slower than 
the maximum for a diffusion-controlled reaction. For those 
enzymes with fast catalytic reactions, natural selection might favor rapid binding to en- 
hance the overall rate. Similarly, an enzyme with rapid binding might evolve a mecha- 
nism that favored a faster reaction. However, most biochemical reactions proceed at 
rates that are more than sufficient to meet the needs of the cell. 

6.5 Modes of Enzymatic Catalysis 

The quantitative effects of various catalytic mechanisms are difficult to assess. We have 
already seen two chemical mechanisms of enzymatic catalysis, acid-base catalysis and 
covalent catalysis. From studies of nonenzymatic catalysts it is estimated that acid-base 
catalysis can accelerate a typical enzymatic reaction by a factor of 10 to 100. Covalent 
catalysis can provide about the same rate acceleration. 


▲ Figure 6.12 

Substrate binding. Dihydrofolate reductase 
binds NADP⁺ (left) and folate (right), 
tioning them in the active site in preparation 
for the reductase reaction. Most of the 
catalytic rate enhancement is due to binding 
effects. [PDB 7DFR] 

▲ Figure 6.13 

The proximity effect. The enzyme 
fructose-1,6-bisphosphate aldolase catalyzes the 
biosynthesis of fructose-1,6-bisphosphate from 
DHAP and G3P during gluconeogenesis and 
the cleavage of fructose-1,6-bisphosphate to 
dihydroxyacetone phosphate (DHAP) and 
glyceraldehyde-3-phosphate (G3P) during 
glycolysis (see Section 11.2#4). In the 
biosynthesis reaction, the two substrates 
DHAP and G3P must be positioned close 
together in the active site in an orientation 
that promotes their joining to form the larger 
fructose-1,6-bisphosphate. This proximity 
effect is illustrated for the aldolase from 
Mycobacterium tuberculosis. [PDB 2EKZ] 

As important as these chemical modes are, they account for only a small portion 
of the observed rate accelerations achieved by enzymes (typically $10^{8}$ to $10^{12}$). The 
ability of proteins to specifically bind and orient ligands explains the remainder. 
The proper binding of reactants in the active sites of enzymes provides not only sub- 
strate and reaction specificity but also most of the catalytic power of enzymes 
(Figure 6.12). 

There are two catalytic modes based on binding phenomena. First, for multisub- 
strate reactions the collecting and correct positioning of substrate molecules in the 
active site raises their effective concentrations over their concentrations in free solution. 
In the same way, binding of a substrate near a catalytic active-site residue decreases the 
activation energy by reducing the entropy while increasing the effective concentrations 
of these two reactants. High effective concentrations favor the more frequent formation 
of transition states. This phenomenon is called the proximity effect. Efficient catalysis re- 
quires fairly weak binding of reactants to enzymes since extremely tight binding would 
inhibit the reaction. 

The second major catalytic mode arising from the ligand-enzyme interaction is the 
increased binding of transition states to enzymes compared to the binding of substrates 
or products. This catalytic mode is called transition state stabilization. There is an equi- 
librium (not the reaction equilibrium) between ES and the enzymatic transition state, 
ES*. Interaction between the enzyme and its ligands in the transition state shifts this 
equilibrium toward ES* and lowers the activation energy. 

The effects of proximity and transition-state stabilization were illustrated in Figure 6.3. 
Experiments suggest that proximity can increase reaction rates more than 10,000-fold, 
and transition-state stabilization can increase reaction rates at least that much. Enzymes 
can achieve extraordinary rate accelerations when both of these effects are multiplied by 
chemical catalytic effects. 
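
The multiplication described above can be sketched numerically. The snippet below takes the figures quoted in this section (a 10- to 100-fold contribution from chemical catalysis, and roughly 10,000-fold each from proximity and transition-state stabilization) and multiplies them, assuming the effects are independent. The binding figures are lower bounds in the text, and the function name `combined_acceleration` is a hypothetical helper, so the result is an order-of-magnitude illustration only.

```python
import math

# Fold accelerations quoted in the text.
CHEMICAL_CATALYSIS_RANGE = (10.0, 100.0)  # acid-base or covalent catalysis
PROXIMITY = 1e4                           # "more than 10,000-fold"
TRANSITION_STATE_STABILIZATION = 1e4      # "at least that much"


def combined_acceleration(*factors: float) -> float:
    """Multiply independent rate-acceleration factors (assumes they are independent)."""
    return math.prod(factors)


low = combined_acceleration(
    CHEMICAL_CATALYSIS_RANGE[0], PROXIMITY, TRANSITION_STATE_STABILIZATION
)
high = combined_acceleration(
    CHEMICAL_CATALYSIS_RANGE[1], PROXIMITY, TRANSITION_STATE_STABILIZATION
)
print(f"Combined acceleration: {low:.0e} to {high:.0e}")
```

With these inputs the product lies inside the range of observed enzymatic rate accelerations cited earlier in this section, and most of it comes from the two binding effects rather than from chemical catalysis.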

The binding forces responsible for formation of ES complexes and for stabilization 
of ES* are familiar from Chapters 2 and 4. These weak forces are charge-charge interac- 
tions, hydrogen bonds, hydrophobic interactions, and van der Waals forces. Charge-charge 
interactions are stronger in nonpolar environments than in water. Because active sites 
are largely nonpolar, charge-charge interactions in the active sites of enzymes can be 
quite strong. The side chains of aspartate, glutamate, histidine, lysine, and arginine 
residues provide negative and positive groups that form ion pairs with substrates in 
active sites. Next in bond strength are hydrogen bonds that often form between 
substrates and enzymes. The peptide backbone and the side chains of many amino 
acids can form hydrogen bonds. Highly hydrophobic amino acids, as well as alanine, 
proline, tryptophan, and tyrosine, can participate in hydrophobic interactions with 
the nonpolar groups of ligands. Many weak van der Waals interactions also help bind 
substrates. Keep in mind that both the chemical properties of the amino acid residues 
and the shape of the active site of an enzyme determine which substrates will bind. 

A. The Proximity Effect 

Enzymes are frequently described as entropy traps — agents that collect highly mobile 
reactants from dilute solution thereby decreasing their entropy and increasing the prob- 
ability of their interaction. You can think of the reaction of two molecules positioned at 
the active site as an intramolecular (unimolecular) reaction. The correct positioning of 
two reacting groups in the active site reduces their degrees of freedom and produces a 
large loss of entropy sufficient to account for a large rate acceleration (Figure 6.13). The 
acceleration is expressed in terms of the enhanced relative concentration, called the 
effective molarity , of the reacting groups in the unimolecular reaction. The effective mo- 
larity can be obtained from the ratio 

$$\text{Effective molarity} = \frac{k_1\ (\mathrm{s^{-1}})}{k_2\ (\mathrm{M^{-1}\,s^{-1}})} \qquad (6.21)$$


where $k_1$ is the rate constant when the reactants are preassembled into a single molecule 
and $k_2$ is the rate constant of the corresponding bimolecular reaction. All the units in this 
equation cancel except M, so the ratio is expressed in molar units. Effective molarities are 
not real concentrations; in fact, for some reactions the values are impossibly high. Never- 
theless, effective molarities indicate how favorably reactive groups are oriented. 
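
The ratio in Equation 6.21 is easy to compute directly. The following sketch uses made-up rate constants purely to show the unit bookkeeping; the numbers and the function name `effective_molarity` are hypothetical and are not measurements from the text.

```python
def effective_molarity(k1_per_s: float, k2_per_molar_per_s: float) -> float:
    """Effective molarity (M) from Equation 6.21.

    k1_per_s: first-order rate constant of the unimolecular reaction (s^-1).
    k2_per_molar_per_s: second-order rate constant of the corresponding
        bimolecular reaction (M^-1 s^-1).
    """
    if k1_per_s <= 0 or k2_per_molar_per_s <= 0:
        raise ValueError("rate constants must be positive")
    # s^-1 divided by M^-1 s^-1 leaves units of M.
    return k1_per_s / k2_per_molar_per_s


# Hypothetical rate constants chosen only for illustration.
k1 = 50.0   # s^-1
k2 = 1e-6   # M^-1 s^-1
print(f"Effective molarity: {effective_molarity(k1, k2):.1e} M")
```

A result this large cannot correspond to a real concentration, which is the point made above: the value measures how favorably the reacting groups are held together, not how much material is present.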

The importance of the proximity effect is illustrated by experiments comparing 
a nonenzymatic bimolecular reaction to a series of chemically similar intramolecular 
reactions (Figure 6.14). The bimolecular reaction was the two-step hydrolysis of 
p-bromophenyl acetate, catalyzed by acetate and proceeding via the formation of 
acetic anhydride. (The second step, hydrolysis of acetic anhydride, is not shown in 
Figure 6.14.) In the unimolecular version, reacting groups were connected by a bridge 
with progressively greater restriction of rotation. With each restriction placed on 
the substrate molecules, the relative rate constant ($k_1/k_2$) increased markedly. The 
glutarate ester (compound 2) has two bonds that allow rotational freedom whereas the 
succinate ester (compound 3) has only one. The most restricted compound, the rigid 
bicyclic compound 4, has no rotational freedom. In this compound, the carboxylate is 

▼ Figure 6.14 

Reactions of a series of carboxylates with 
substituted phenyl esters. The proximity 
effect is illustrated by the increase in rate 
observed when the reactants are held more 
rigidly in proximity. Reaction 4 is 50 million 
times faster than Reaction 1, the bimolecu- 
lar reaction. [Based on Bruice and Pandit 
(1960). Intramolecular models depicting 
the kinetic importance of “fit” in enzymatic 
catalysis. Biochem. 46:402-404.] 


[Figure 6.14 reaction schemes: structures not reproduced. Relative rates legible in this copy: compound 2, 1 × 10³; compound 3, 2 × 10⁵.] 




The correct binding and positioning of 
specific substrates in the active site of an 
enzyme produces a large acceleration in 
the rate of a reaction. 

close to the ester and the reacting groups are properly aligned. The effective molarity of 
the carboxylate group is $5 \times 10^{7}$ M. Compound 4 has an extremely high probability of 
reaction because very little entropy must be lost to reach the transition state. 
Theoretical considerations suggest that the greatest rate acceleration that can be expected from 
the proximity effect is about $10^{8}$. This entire rate acceleration can be attributed to the 
loss of entropy that occurs when two reactants are properly positioned for reaction. 
These intramolecular reactions can serve as a model of the positioning of two substrates 
bound in the active site of an enzyme. 

B. Weak Binding of Substrates to Enzymes 

Reactions of ES complexes are analogous to unimolecular reactions even when two 
substrates are involved. Although the correct positioning of substrates in an active site 
produces a large rate acceleration, enzymes do not achieve the maximum $10^{8}$ 
acceleration theoretically generated by the proximity effect. Typically, the loss in entropy on 
binding of the substrate allows an acceleration of only $10^{4}$. That’s because in ES 
complexes the reactants are brought toward, but not extremely close to, the transition state. 
This conclusion is based on both mechanistic reasoning and measurements of the 
tightness of binding of substrates and inhibitors to enzymes. One major limitation is 
that binding of substrates to enzymes cannot be extremely tight; that is, $K_m$ values 
cannot be extremely low. 

Figure 6.15 shows energy diagrams for a nonenzymatic unimolecular reaction and 
the corresponding multistep enzyme-catalyzed reaction. As we will see in the next 
section, an enzyme increases the rate of a reaction by stabilizing (i.e., tightly binding) the 
transition state. Therefore, the energy required for ES to reach the transition state (ES*) 
in the enzymatic reaction is less than the energy required for S to reach S*, the transition 
state in the nonenzymatic reaction. 

Recall that the substrate must be bound fairly weakly in the ES complex. If a sub- 
strate were bound extremely tightly, it could take just as much energy to reach ES* from 
ES (the arrow labeled 2) as is required to reach S* from S in the nonenzymatic reaction 
(the arrow labeled 1). In other words, extremely tight binding of the substrate would 
mean little or no catalysis. Excessive ES stability is a thermodynamic pit. The role of en- 
zymes is to bind and position substrates before the transition state is reached but not so 
tightly that the ES complex is too stable. 

The $K_m$ values (representing dissociation constants) of enzymes for their substrates 
show that enzymes avoid the thermodynamic pit. Most $K_m$ values are on the order of 
$10^{-4}$ M, a number that indicates weak binding of the substrate. Enzymes specific for 
small substrates, such as urea, carbon dioxide, and superoxide anion, exhibit relatively 
high $K_m$ values for these compounds ($10^{-3}$ to $10^{-4}$ M) because these molecules can 
form few noncovalent bonds with enzymes. Enzymes typically have low $K_m$ values 

Figure 6.15 ► 

Energy of substrate binding. In this hypotheti- 
cal reaction, the enzyme accelerates the 
rate of the reaction by stabilizing the transi- 
tion state. In addition, the activation barrier 
for formation of the transition state ES* 
from ES must be relatively low. If the en- 
zyme bound the substrate too tightly 
(dashed profile), the activation barrier (2) 
would be comparable to the activation bar- 
rier of the nonenzymatic reaction (1). 

Reaction coordinates 


($10^{-6}$ to $10^{-5}$ M) for coenzymes, which are bulkier than many substrates. The $K_m$ values 
for the binding of ATP to most ATP-requiring enzymes are about $10^{-4}$ M or greater but 
the muscle-fiber protein myosin (which is not an enzyme) binds ATP a billionfold more 
avidly. This large difference in binding reflects the fact that in an ES complex not all 
parts of the substrate are bound. 

When the concentration of a substrate inside a cell is below the $K_m$ value of its 
corresponding enzyme, the equilibrium of the binding reaction $\mathrm{E + S \rightleftharpoons ES}$ favors E + S. In 
other words, the formation of the ES complex is slightly uphill energetically (Figures 6.3 
and 6.15), and the ES complex is closer to the energy of the transition state than the 
ground state is. This weak binding of substrates accelerates reactions. $K_m$ values 
appear to be optimized by evolution for effective catalysis: low enough that proximity is 
achieved, but high enough that the ES complex is not too stable. The weak binding of 
substrates is an important feature of another major force that drives enzymatic catalysis: 
increased binding of reactants in the ES* transition state. 
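
The statement that the binding equilibrium favors E + S when the substrate concentration is below Km can be checked with a simple binding calculation. The sketch below treats Km as a dissociation constant, as this section does, and assumes free substrate is in large excess over enzyme; the function name `fraction_bound` and the example concentrations are illustrative assumptions.

```python
def fraction_bound(substrate_conc: float, km: float) -> float:
    """Fraction of enzyme present as ES at a given free substrate concentration.

    Treats Km as the dissociation constant of ES and assumes that substrate is
    in large excess over enzyme, so free [S] is close to total [S].
    """
    if substrate_conc < 0 or km <= 0:
        raise ValueError("concentration must be non-negative and Km positive")
    return substrate_conc / (km + substrate_conc)


km = 1e-4  # M, a typical Km value according to the text
for s in (1e-5, 1e-4, 1e-3):  # illustrative substrate concentrations (M)
    print(f"[S] = {s:.0e} M -> fraction of enzyme as ES = {fraction_bound(s, km):.2f}")
```

When the substrate concentration equals Km, half of the enzyme is in the ES complex; below Km, less than half is, which is the sense in which the equilibrium favors free enzyme and substrate.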

C. Induced Fit 

Enzymes resemble solid catalysts by having limited flexibility but they are not entirely 
rigid molecules. The atoms of proteins are constantly making small, rapid motions, and 
small conformational adjustments occur on binding of ligands. An enzyme is most 
effective if it is in the active form initially so no binding energy is consumed in convert- 
ing it to an active conformation. In some cases, however, enzymes undergo major shape 
alterations when substrate molecules bind. The enzyme shifts from an inactive to an 
active form. Activation of an enzyme by a substrate-initiated conformation change is 
called induced fit. Induced fit is not a catalytic mode but primarily a substrate specificity 
effect. 

One example of induced fit is seen with hexokinase, an enzyme that catalyzes the 
phosphorylation of glucose by ATP: 

$$\text{Glucose} + \text{ATP} \longrightarrow \text{Glucose 6-phosphate} + \text{ADP} \qquad (6.22)$$

Water (HOH), which resembles the alcoholic group at C-6 of glucose (ROH), is small 
enough and of the proper shape to fit into the active site of hexokinase and therefore it 
should be a good substrate. If water entered the active site, hexokinase would quickly 
catalyze the hydrolysis of ATP. However, hexokinase-catalyzed hydrolysis of ATP was 
shown to be 40,000 times slower than phosphorylation of glucose. 

How does the enzyme avoid nonproductive hydrolysis of ATP in the absence of 
glucose? Structural experiments with hexokinase show that the enzyme exists in two 
conformations: an open form when glucose is absent, and a closed form when glucose is 
bound. The angle between the two domains of hexokinase changes considerably when 
glucose binds, closing the cleft in the enzyme-glucose complex (Figure 6.16). Produc- 
tive hydrolysis of ATP can only take place in the closed form of the enzyme where the 
newly formed active site is already occupied by glucose. Water is not a large enough sub- 
strate to induce a change in the conformation of hexokinase and this explains why 
water does not stimulate ATP hydrolysis. Thus, sugar-induced closure of the hexokinase 
active site prevents wasteful hydrolysis of ATP. A number of other kinases follow 
induced fit mechanisms. 

The substrate specificity that occurs with the induced fit mechanism of hexokinase 
economizes cellular ATP but exacts a catalytic price. The binding energy consumed in 
moving the protein molecule into the closed shape — a less-favored conformation — is 
energy that cannot be used for catalysis. Consequently, an enzyme that uses an induced 
fit mechanism is less effective as a catalyst than a hypothetical enzyme that is always in 
an active shape and catalyzes the same reaction. The catalytic cost of induced fit slows 
kinases so that their $k_{\text{cat}}$ values are approximately $10^{3}\ \mathrm{s^{-1}}$ (Table 5.1). We will see 
another example of induced fit and how it economizes metabolic energy in Section 13.3#1 
when we describe citrate synthase. The loop-closing reaction of triose phosphate iso- 
merase is also an example of an induced fit binding mechanism. 

The meaning of $K_m$ is discussed in 
Section 5.3C. In most cases, it 
represents a good approximation of the 
dissociation constant for the reaction 
$\mathrm{E + S \rightleftharpoons ES}$. Thus, a $K_m$ of $10^{-4}$ M 
means that at equilibrium the concen- 
tration of ES will be approximately 
10,000-fold higher than the concentra- 
tion of free substrate. 

▲ Figure 6.16 

Yeast hexokinase. Yeast hexokinase contains 
two structural domains connected by a 
hinge region. On binding of glucose, these 
domains close, shielding the active site from 
water. (a) Open conformation. (b) Closed 
conformation. [PDB 2YHX and 1HKG] 



Most enzymes exhibit some form of 
induced fit binding mechanism. 

Hexokinase, citrate synthase, and triose phosphate isomerase are extreme exam- 
ples of induced fit mechanisms. Recent advances in the study of enzyme structures re- 
veal that almost all enzymes undergo some conformational change when substrate 
binds. The simple concept of a rigid lock and a rigid key is being replaced by a more dy- 
namic interaction where both the “lock” (enzyme) and the “key” (substrate) adjust to 
each other to form a perfect match. 


The catalytic power of enzymes is 
explained by binding effects (positioning 
the substrates together in the correct 
orientation) and stabilization of the 
transition state. The result is a lower 
activation energy and an increased rate 
of reaction. 

The role of adenosine deaminase is 
described in Section 18.8. 

D. Transition-State Stabilization 

Enzymes catalyze reactions by physically or electronically distorting the structures of 
substrates making them similar to the transition state of the reaction. Transition-state 
stabilization (the increased interaction of the enzyme with the substrate in the 
transition state) explains a large part of the rate acceleration of enzymes. 

Recall Emil Fischer’s lock-and-key theory of enzyme specificity described in Sec- 
tion 5.2B. Fischer proposed that enzymes were rigid templates that accepted only cer- 
tain substrates as keys. This idea has been replaced by a more dynamic model where 
both enzyme and substrate change conformations when they interact. Furthermore, 
the classic lock-and-key model dealt with the interaction between enzyme and sub- 
strate but we now think of it in terms of enzyme and transition state — the “key” in the 
“lock” is the transition state and not the substrate molecule. When a substrate binds to 
an enzyme the enzyme distorts the structure of the substrate forcing it toward the 
transition state. Maximal interaction with the substrate molecule occurs only in ES*. A 
portion of this binding in ES* can be between the enzyme and nonreacting portions of 
the substrate. 

An enzyme must be complementary to the transition state in shape and in chemi- 
cal character. The graph in Figure 6.15 shows that tight binding of the transition state to 
an enzyme can lower the activation energy. Because the energy difference between E + S 
and ES* is significantly less than the energy difference between S and S*, $k_{\text{cat}}$ is greater 
than $k_n$ (the rate constant for the nonenzymatic reaction). The enzyme-substrate 
transition state (ES*) is lower in absolute energy, and therefore more stable, than the 
transition state of the reactant in the uncatalyzed reaction. Some transition states may bind 
to their enzymes $10^{10}$ to $10^{15}$ times more tightly than their substrates do. The affinity of 
other enzymes for their transition states need not be that extreme. A major task for bio- 
chemists is to show how transition state stabilization occurs. 
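
To put the tighter binding of transition states on an energy scale, a ratio of binding affinities can be converted to a free-energy difference with the standard relation ΔΔG = RT ln(ratio). The sketch below applies it to the two ends of the range quoted above. The temperature of 298.15 K is an assumption, and the function name `binding_energy_difference_kj` is hypothetical.

```python
import math

GAS_CONSTANT = 8.314   # J mol^-1 K^-1
TEMPERATURE = 298.15   # K; assumed, the text does not specify a temperature


def binding_energy_difference_kj(affinity_ratio: float,
                                 temperature: float = TEMPERATURE) -> float:
    """Free-energy difference (kJ/mol) for a given ratio of binding affinities.

    Uses delta-delta-G = R * T * ln(ratio), where the ratio describes how much
    more tightly one ligand binds than another.
    """
    if affinity_ratio <= 0:
        raise ValueError("affinity ratio must be positive")
    return GAS_CONSTANT * temperature * math.log(affinity_ratio) / 1000.0


for ratio in (1e10, 1e15):
    energy = binding_energy_difference_kj(ratio)
    print(f"{ratio:.0e}-fold tighter binding ~ {energy:.1f} kJ/mol")
```

Under these assumptions the range corresponds to several tens of kilojoules per mole of extra binding energy directed at the transition state rather than at the substrate.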

The comparative stabilization of ES* could occur if an enzyme has an active site 
with a shape and an electrostatic structure that more closely fits the transition state than 
the substrate. An undistorted substrate molecule would not be fully bound. For exam- 
ple, an enzyme could have sites that bind the partial charges present only in the unstable 
transition state. 

Transition-state molecules are ephemeral; they have very short half-lives and are 
difficult to detect. One way in which biochemists can study transition states is to create 
stable analogs that can bind to the enzyme. These transition-state analogs are molecules 
whose structures resemble presumed transition states. If enzymes prefer to bind to 
transition states, then a transition-state analog should bind extremely tightly to the 
appropriate enzyme, much more tightly than substrate, and thus be a potent inhibitor. The 
dissociation constant for a transition-state analog should be about $10^{-13}$ M or less. 

One of the first examples of a transition-state analog was 2-phosphoglycolate 
(Figure 6.17), whose structure resembles the first enediolate transition state in the reac- 
tion catalyzed by triose phosphate isomerase (Section 6.4A). This transition- state ana- 
log binds to the isomerase at least 100 times more tightly than either of the substrates of 
the enzyme (Figure 6.18). Tighter binding results from a partially negative oxygen atom 
in the carboxylate group of 2-phosphoglycolate, a feature shared with the transition 
state but not with the substrates. 

Experiments with adenosine deaminase have identified a transition-state analog 
that binds to the enzyme with amazing affinity because it resembles the transition state 
very closely. Adenosine deaminase catalyzes the hydrolytic conversion of the purine nu- 
cleoside adenosine to inosine. The first step of this reaction is the addition of a molecule 



[Structures: transition state, 2-phosphoglycolate (transition-state analog), and enediol intermediate] 

◄ Figure 6.17 

2-Phosphoglycolate, a transition-state 
analog for the enzyme triose phosphate 
isomerase. 2-Phosphoglycolate is pre- 
sumed to be an analog of C-2 and C-3 
of the transition state (center) between 
dihydroxyacetone phosphate (right) and 
the initial enediolate intermediate in 
the reaction. 

of water (Figure 6.19a). The complex with water, called a covalent hydrate, forms as soon 
as adenosine is bound to the enzyme and quickly decomposes to products. Adenosine 
deaminase has broad substrate specificity and catalyzes the hydrolytic removal of 
various groups from position 6 of purine nucleosides. However, the inhibitor purine 
ribonucleoside (Figure 6.19b) has just hydrogen at position 6 and undergoes only the first 
enzymatic step of hydrolysis, addition of the water molecule. The covalent hydrate that is 
formed is a transition-state analog, a competitive inhibitor having a $K_i$ of $3 \times 10^{-13}$ M. 
(For comparison, the affinity constant of adenosine deaminase for its true transition 
state is expected to be $3 \times 10^{-17}$ M.) The binding of this analog exceeds the binding of 
either the substrate or the product by a factor of more than $10^{8}$. A very similar reduced 
inhibitor, 1,6-dihydropurine ribonucleoside (Figure 6.19c), lacks the hydroxyl group at 
C-6, and it has a $K_i$ of only $5 \times 10^{-6}$ M. We can conclude from these studies that adenosine 

◄ Figure 6.18 

Binding of 2-phosphoglycolate to triose phos- 
phate isomerase. The transition-state analog, 
2-phosphoglycolate, is bound at the 
active site of Plasmodium falciparum triose 
phosphate isomerase. The molecule is held 
in position by many hydrogen bonds between 
the phosphate group and surrounding amino 
acid side chains. Some of the hydrogen 
bonds are formed through bridged “frozen” 
water molecules in the active site. The 
catalytic residues, Glu-165 and His-95, 
form hydrogen bonds with the carboxylate 
group of 2-phosphoglycolate as expected in 
the transition state. [PDB 1LYZ] 



[Structures: (a) adenosine and its covalent hydrate; (b) purine ribonucleoside (substrate analog); (c) 1,6-dihydropurine ribonucleoside (competitive inhibitor)] 

▲ Figure 6.20 

Adenosine deaminase with bound transition- 
state analog. 

▲ Figure 6.19 

Inhibition of adenosine deaminase by a transition-state analog. (a) In the deamination 
of adenosine, a proton is added to N-1 and a hydroxide ion is added to C-6 to form 
an unstable covalent hydrate, which decomposes to produce inosine and ammonia. 
(b) The inhibitor purine ribonucleoside also rapidly forms a covalent hydrate, 
6-hydroxy-1,6-dihydropurine ribonucleoside. This covalent hydrate is a transition-state 
analog that binds more than a million times more avidly than another competitive 
inhibitor, 1,6-dihydropurine ribonucleoside (c), which differs from the 
transition-state analog only by the absence of the 6-hydroxyl group. 

deaminase must specifically and avidly bind the transition-state analog — 
and also the transition state — through interaction with the hydroxyl 
group at C-6. 
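
The size of the effect attributed to the C-6 hydroxyl group follows directly from the two inhibition constants quoted for adenosine deaminase. The sketch below simply takes their ratio; the variable and function names are hypothetical, and the calculation assumes that the two inhibitors differ only in that hydroxyl group, as the text states.

```python
# Inhibition constants quoted in the text for adenosine deaminase (M).
KI_COVALENT_HYDRATE = 3e-13      # 6-hydroxy-1,6-dihydropurine ribonucleoside
KI_DIHYDROPURINE = 5e-6          # 1,6-dihydropurine ribonucleoside (no 6-OH)


def fold_tighter(ki_weak: float, ki_tight: float) -> float:
    """How many times more tightly the inhibitor with the smaller Ki binds."""
    if ki_weak <= 0 or ki_tight <= 0:
        raise ValueError("inhibition constants must be positive")
    return ki_weak / ki_tight


ratio = fold_tighter(KI_DIHYDROPURINE, KI_COVALENT_HYDRATE)
print(f"The hydroxylated analog binds about {ratio:.1e} times more tightly")
```

The ratio is well over a million, consistent with the claim that the analog carrying the 6-hydroxyl group binds more than a million times more avidly than the inhibitor lacking it.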

The structure of adenosine deaminase with the bound transition- 
state analog is shown in Figure 6.20 and the interactions between the 
analog and amino acid side chains in the active site are depicted in Figure 
6.21. Notice the hydrogen bonds between Asp-292 and the hydroxyl 
group on C-6 of 6-hydroxy-1,6-dihydropurine and the interaction 
between this hydroxyl group and a bound zinc ion in the active site. This 
confirms the hypothesis that the enzyme specifically binds the transition 
state in the normal reaction. 






[Diagram: the transition-state analog in the active site; labels legible in this copy include Asp296, a bound Zn²⁺ ion, and water 569] 

▲ Figure 6.21 

Transition-state analog binding to adenosine deaminase. The interactions between the transition-state 
analog, 6-hydroxy-1,6-dihydropurine, and amino acid side chains in the active site of adenosine 
deaminase confirm that the enzyme recognizes the hydroxyl group at C-6. [PDB 1KRM] 



6.6 Serine Proteases 

Serine proteases are a class of enzymes that cleave the peptide bond of proteins. As the 
name implies, they are characterized by the presence of a catalytic serine residue in their 
active sites. The best-studied serine proteases are the related enzymes trypsin, chy- 
motrypsin, and elastase. These enzymes provide an excellent opportunity to explore the 
relationship between protein structure and catalytic function. They have been inten- 
sively studied for 50 years and form an important part of the history of biochemistry 
and the elucidation of enzyme mechanisms. In this section, we see how the activity of 
serine proteases is regulated by zymogen activation and examine a structural basis for 
the substrate specificity of different serine proteases. 

A. Zymogens Are Inactive Enzyme Precursors 

Mammals digest food in the stomach and intestines. During this process, food pro- 
teins undergo a series of hydrolytic reactions as they pass through the digestive tract. 
Following mechanical disruption by chewing and moistening with saliva, foods are 
swallowed and mixed with hydrochloric acid in the stomach. The acid denatures pro- 
teins and pepsin (a protease that functions optimally in an acidic environment) cat- 
alyzes hydrolysis of these denatured proteins to a mixture of peptides. The mixture 
passes into the intestine where it is neutralized by sodium bicarbonate and digested 
by the action of several proteases to amino acids and small peptides that can be ab- 
sorbed into the bloodstream. 

Pepsin is initially secreted as an inactive precursor called pepsinogen. When 
pepsinogen encounters HCl in the stomach, it is activated to cleave itself, forming the
more active protease, pepsin. The stomach secretions are stimulated by food — or even 
the anticipation of food — as shown by Ivan Pavlov in his experiments with dogs over 
100 years ago. (Pavlov was awarded a Nobel Prize in 1904.) The inactive precursor is 
called a zymogen. Pavlov was the first to show that zymogens could be converted to ac- 
tive proteases in the stomach and intestines. 

The main serine proteases are trypsin, chymotrypsin, and elastase. Together, they 
catalyze much of the digestion of proteins in the intestine. Like pepsin, these enzymes are 
also synthesized and stored as inactive precursors called zymogens. The zymogens are called
trypsinogen, chymotrypsinogen, and proelastase. They are synthesized in the pancreas.
It is important to store these hydrolytic enzymes as inactive precursors within the cell since
the active proteases would kill the pancreatic cells by cleaving cytoplasmic proteins.


1. Rely on enzymology to clarify biologic questions 

2. Trust the universality of biochemistry and the power of microbiology 

3. Do not believe something because you can explain it 

4. Do not waste clean thinking on dirty enzymes 

5. Do not waste clean enzymes on dirty substrates 

6. Depend on viruses to open windows 

7. Correct for extract dilution with molecular crowding 

8. Respect the personality of DNA 

9. Use reverse genetics and genomics 

10. Employ enzymes as unique reagents 

Arthur Kornberg, Nobel Laureate in Physiology or Medicine 1959 

Kornberg, A. (2000). Ten commandments: lessons from the enzymology of DNA replication. 
J. Bacteriol. 182:3613-3618.

Kornberg, A. (2003). Ten commandments of enzymology, amended. Trends Biochem. Sci. 






▲ Figure 6.22 

Activation of some pancreatic zymogens. 

Initially, enteropeptidase catalyzes the acti- 
vation of trypsinogen to trypsin. Trypsin then 
activates chymotrypsinogen, proelastase, 
and additional trypsinogen molecules. 



▲ Figure 6.23 

Polypeptide chains of chymotrypsinogen (left)
[PDB 2CGA] and α-chymotrypsin (right) [PDB
5CHA]. Ile-16 and Asp-194 in both zymogen
and the active enzyme are shown in yellow. 
The catalytic-site residues (Asp-102, His- 
57, and Ser-195) are shown in red. The 
residues that are removed by processing the 
zymogen are colored green. 

The enzymes are activated by selective proteolysis — enzymatic cleavage of one or a 
few specific peptide bonds — when they are secreted from the pancreas into the small in- 
testine. A protease called enteropeptidase specifically activates trypsinogen to trypsin by 
catalyzing cleavage of the bond between Lys-6 and Ile-7. Once activated by the removal of 
its N-terminal hexapeptide, trypsin proteolytically cleaves the other pancreatic zymogens, 
including additional trypsinogen molecules (Figure 6.22). 

The activation of chymotrypsinogen to chymotrypsin is catalyzed by trypsin and 
by chymotrypsin itself. Four peptide bonds (between residues 13 and 14, 15 and 16, 146 
and 147, and 148 and 149) are cleaved resulting in the release of two dipeptides. The result- 
ing chymotrypsin retains its three-dimensional shape, despite two breaks in its back- 
bone. This stability is partly due to the presence of five disulfide bonds in the protein. 
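
The bookkeeping behind this activation can be sketched in a few lines of Python. The example below is only an illustration: the function name is hypothetical, and the chain length of 245 residues for chymotrypsinogen is an assumption that is not stated in the text above. The four cleaved bonds are the ones listed in the paragraph.

```python
def activation_fragments(chain_length, cleaved_bonds):
    """Split residues 1..chain_length at the given peptide bonds.

    Each bond is a pair (i, i + 1) of adjacent residue numbers.
    Returns a list of (first_residue, last_residue) fragments.
    """
    fragments = []
    start = 1
    for last_residue in sorted(i for i, _ in cleaved_bonds):
        fragments.append((start, last_residue))
        start = last_residue + 1
    fragments.append((start, chain_length))
    return fragments

CHYMOTRYPSINOGEN_LENGTH = 245  # assumption, not given in the text
bonds = [(13, 14), (15, 16), (146, 147), (148, 149)]

fragments = activation_fragments(CHYMOTRYPSINOGEN_LENGTH, bonds)
dipeptides = [f for f in fragments if f[1] - f[0] + 1 == 2]
chains = [f for f in fragments if f not in dipeptides]

print("released dipeptides:", dipeptides)
print("chains retained in chymotrypsin:", chains)
```

With these inputs the two released dipeptides are residues 14-15 and 147-148, and three chains remain. The middle chain begins at residue 16, which matches the new N-terminal Ile-16 discussed in connection with Figure 6.23.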

X-ray crystallography has revealed one major difference between the conformation 
of chymotrypsinogen and chymotrypsin — the lack of a hydrophobic substrate-binding 
pocket in the zymogen. The differences are shown in Figure 6.23 where the structures of 
chymotrypsinogen and chymotrypsin are compared. On zymogen activation, the newly 
generated α-amino group of Ile-16 turns inward and interacts with the β-carboxyl
group of Asp-194 to form an ion pair. This local conformational change generates a
relatively hydrophobic substrate-binding pocket near the three catalytic residues with
ionizable side chains (Asp-102, His-57, and Ser-195).

B. Substrate Specificity of Serine Proteases 

Chymotrypsin, trypsin, and elastase are similar enzymes that share a common ancestor; 
in other words, they are homologous. Each enzyme has a two-lobed structure with the 
active site located in a cleft between the two domains. The positions of the catalytically 
active side chains of the serine, histidine, and aspartate residues in the active sites are 
almost identical in the three enzymes (Figure 6.24). 

The substrate specificities of chymotrypsin, trypsin, and elastase have been 
explained by relatively small structural differences in the enzymes. Recall that trypsin 
catalyzes the hydrolysis of peptide bonds whose carbonyl groups are contributed by 
arginine or lysine (Section 3.10). Both chymotrypsin and trypsin contain a binding 
pocket that correctly positions the substrates for nucleophilic attack by an active-site 
serine residue. Each protease has a similar extended region into which polypeptides fit 
but the so-called specificity pocket near the active-site serine is markedly different for
each enzyme. Trypsin differs from chymotrypsin because in chymotrypsin there is an 
uncharged serine residue at the base of the hydrophobic binding pocket. In trypsin this 
residue is an aspartate residue (Figure 6.25). This negatively charged aspartate residue 
forms an ion pair with the positively charged side chain of an arginine or lysine residue 
of the substrate in the ES complex. Experiments with specifically mutated trypsin indicate 
that the aspartate residue at the base of its specificity pocket is a major factor in sub- 
strate specificity but other parts of the molecule also affect specificity. 

Elastase catalyzes the degradation of elastin, a fibrous protein that is rich in glycine 
and alanine residues. Elastase is similar in tertiary structure to chymotrypsin except that 


▲ Figure 6.24 

Serine proteases. Comparison of the polypeptide backbones of (a) chymotrypsin [PDB 5CHA], (b) trypsin 
[PDB 1TLD], and (c) elastase [PDB 3EST]. Residues at the catalytic center are shown in red. 


[Figure 6.25 panels: (a) Chymotrypsin; (b) Trypsin; (c) Elastase. Atom key: carbon, nitrogen, oxygen]

◄ Figure 6.25 

Binding sites of chymotrypsin, trypsin, and
elastase. The differing binding sites of these
three serine proteases are primary determinants
of their substrate specificities. (a) Chymotrypsin
has a hydrophobic pocket that binds the side
chains of aromatic or bulky hydrophobic amino
acid residues. (b) A negatively charged
aspartate residue at the bottom of the binding
pocket of trypsin allows trypsin to bind the
positively charged side chains of lysine and
arginine residues. (c) In elastase, the side
chains of a valine and a threonine residue at
the binding site create a shallow binding
pocket. Elastase binds only amino acid
residues with small side chains, especially
glycine and alanine residues.

its binding pocket is much shallower. Two glycine residues found at the entrance of the 
binding site of chymotrypsin and trypsin are replaced in elastase by much larger valine 
and threonine residues (Figure 6.25c). These residues keep potential substrates with 
large side chains away from the catalytic center. Thus, elastase specifically cleaves pro- 
teins that have small residues such as glycine and alanine. 
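
The specificity rules described in this section can be summarized as a simple lookup. The Python sketch below is a deliberate simplification and not a predictive tool. It assumes that all three enzymes cleave the peptide bond on the carbonyl side of the preferred residue, as stated above for trypsin. It reduces the preference of chymotrypsin to the three aromatic residues and ignores every other influence on cleavage. The peptide sequence and all names are hypothetical.

```python
# Residue preferences in one-letter code, reduced to the cases named in the text.
SPECIFICITY = {
    "trypsin": set("KR"),        # lysine, arginine
    "chymotrypsin": set("FYW"),  # aromatic residues only (a simplification)
    "elastase": set("GA"),       # glycine, alanine
}

def cleavage_sites(sequence, protease):
    """Return 1-based positions of residues whose carbonyl-side bond is cleaved.

    The C-terminal residue is skipped because no peptide bond follows it.
    """
    preferred = SPECIFICITY[protease]
    return [i + 1 for i, residue in enumerate(sequence[:-1]) if residue in preferred]

peptide = "AKFGRWAL"  # hypothetical test peptide
for name in SPECIFICITY:
    print(name, cleavage_sites(peptide, name))
```

For this hypothetical peptide the three enzymes give three non-overlapping sets of cut positions, which is the point of the comparison in Figure 6.25: a small change in the specificity pocket redirects the same catalytic machinery to different bonds.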

C. Serine Proteases Use Both the Chemical 
and the Binding Modes of Catalysis 

Let's examine the mechanism of chymotrypsin and the roles of three catalytic residues:
His-57, Asp-102, and Ser-195. Many enzymes catalyze the cleavage of amide or ester
bonds by the same process, so study of the chymotrypsin mechanism can be applied to a
large family of hydrolases.

Asp-102 is buried in a rather hydrophobic environment. It is hydrogen-bonded to
His-57, which in turn is hydrogen-bonded to Ser-195 (Figure 6.26). This group of amino acid
residues is called the catalytic triad. The reaction cycle begins when His-57 abstracts a
proton from Ser-195 (Figure 6.27). This creates a powerful nucleophile (Ser-195) that will
eventually attack the peptide bond. Initiation of this part of the reaction is favored because
Asp-102 stabilizes the histidine, promoting its ability to deprotonate the serine residue.

The discovery that Ser-195 is a catalytic residue of chymotrypsin was surprising be- 
cause the side chain of serine is usually not sufficiently acidic to undergo deprotonation 
in order to serve as a strong nucleophile. The hydroxymethyl group of a serine residue 
generally has a pKa of about 16 and is similar in reactivity to the hydroxyl group of
ethanol. You may recall from organic chemistry that although ethanol can ionize to 

▲ Figure 6.26 

The catalytic site of chymotrypsin. The active- 
site residues Asp-102, His-57, and Ser-195 
are arrayed in a hydrogen-bonded network. 
The conformation of these three residues is 
stabilized by a hydrogen bond between the 
carbonyl oxygen of the carboxylate side 
chain of Asp-102 and the peptide-bond 
nitrogen of His-57. Oxygen atoms of the 
active-site residues are red, and nitrogen 
atoms are dark blue. [PDB 5CHA]. 









▲ Figure 6.27 

Catalytic triad of chymotrypsin. The imidazole ring of His-57 removes the proton from the hydroxymethyl side chain of Ser-195 (to 
which it is hydrogen-bonded), thereby making Ser-195 a powerful nucleophile. This interaction is facilitated by interaction of the 
imidazolium ion with its other hydrogen-bonded partner, the buried β-carboxylate group of Asp-102. The residues of the triad are
drawn in an arrangement similar to that shown in Figure 6.24. 



It's a little-known fact that 75% of all laundry detergents contain
proteases that are used in helping to remove stubborn protein- 
based stains from dirty clothes. 

All protease additives are based on serine proteases iso- 
lated from various Bacillus species. These enzymes have been 
extensively modified in order to be active under the harsh 
conditions of a detergent solution at high temperature. A 
successful example of site-directed mutagenesis is the alter- 
ation of the serine protease subtilisin from Bacillus subtilis 
(Box 6.4) to make it more resistant to chemical oxidation. It 
has a methionine residue in the active-site cleft (Met-222) 
that is readily oxidized leading to inactivation of the enzyme. 
Resistance to oxidation increases the suitability of subtilisin 
as a detergent additive. Met-222 was systematically replaced 
by each of the other common amino acids in a series of mu- 
tagenic experiments. All 19 possible mutant subtilisins were 
isolated and tested and most had greatly diminished peptidase 
activity. The Cys-222 mutant had high activity but was also 
subject to oxidation. The Ala-222 and Ser-222 mutants, with 
nonoxidizable side chains, were not inactivated by oxidation 
and had relatively high activity. They were the only active, 
oxygen-stable mutant subtilisin variants.

Site-directed mutagenesis has been performed to alter 
eight of the 319 amino acid residues of a bacterial protease. 

The wild-type protease is moderately stable when heated but 
the suitably mutated enzyme is stable and can function at 
100°C. Its denaturation in detergent is prevented by groups, 
such as a disulfide bridge, that stabilize its conformation. 

Recently there has been a trend to lower wash tempera- 
tures in order to save energy. The older group of enzymes are 
not effective at lower wash temperatures so a whole new 
round of bioengineering has begun creating modified en- 
zymes that can be effective in a modern energy-conscious 

form an ethoxide, this reaction requires the presence of an extremely strong base or
treatment with an alkali metal. We see below how the active site of chymotrypsin
achieves this ionization in the presence of a substrate.

A proposed mechanism for chymotrypsin and related serine proteases includes co- 
valent catalysis (by a nucleophilic oxygen) and general acid-base catalysis (donation of 
a proton to form a leaving group). The steps of the proposed mechanism are illustrated 
in Figure 6.28. 

Binding of the peptide substrate causes a slight conformation change in
chymotrypsin, sterically compressing Asp-102 and His-57. A low-barrier hydrogen bond is
formed between these side chains and the pKa of His-57 rises from about 7 to about 11.
(Formation of this strong, almost covalent, bond drives electrons toward the second N
atom of the imidazole ring of His-57, making it more basic.) This increase in basicity
makes His-57 an effective general base for abstracting a proton from the -CH2OH of
Ser-195. This mechanism explains how the normally unreactive alcohol group of serine
becomes a potent nucleophile.
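
The effect of the pKa shift can be put into numbers with a short calculation. The Python sketch below is a rough illustration, not a description of the actual active site. It assumes that the serine hydroxyl keeps its solution pKa of about 16 and that proton transfer from serine to histidine can be treated as a simple acid-base equilibrium whose constant is 10 raised to the difference in pKa values. The function name is hypothetical.

```python
def proton_transfer_equilibrium(pka_donor, pka_acceptor_conjugate_acid):
    """Equilibrium constant for transferring a proton from donor to acceptor.

    K = Ka(donor) / Ka(acceptor-H+) = 10 ** (pKa(acceptor-H+) - pKa(donor)).
    """
    return 10.0 ** (pka_acceptor_conjugate_acid - pka_donor)

PKA_SERINE_OH = 16.0        # approximate value quoted in the text
PKA_HIS_UNPERTURBED = 7.0   # His-57 before substrate binding
PKA_HIS_RAISED = 11.0       # His-57 with the low-barrier hydrogen bond

k_before = proton_transfer_equilibrium(PKA_SERINE_OH, PKA_HIS_UNPERTURBED)
k_after = proton_transfer_equilibrium(PKA_SERINE_OH, PKA_HIS_RAISED)

print(f"K before the pKa shift: {k_before:.0e}")
print(f"K after the pKa shift:  {k_after:.0e}")
print(f"improvement: {k_after / k_before:.0e}-fold")
```

On these assumptions, raising the pKa of His-57 by four units makes deprotonation of Ser-195 about ten thousand times more favorable, although the equilibrium still lies on the side of the protonated serine.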

All the catalytic modes described in this chapter are used in the mechanisms of
serine proteases. In the reaction scheme shown in Figure 6.28, steps 1 and 4 in the forward
direction use the proximity effect, the gathering of reactants. For example, when a water
molecule replaces the amine (P1) in step 4, it is held by histidine, providing a proximity
effect. Acid-base catalysis by histidine lowers the energy barriers for steps 2 and 4.
Covalent catalysis using the -CH2OH of serine occurs in steps 2 through 5. The unstable
tetrahedral intermediates at steps 2 and 4 (E-TI1 and E-TI2) are believed to be similar to
the transition states for these steps. Hydrogen bonds in the oxyanion hole stabilize these
intermediates, which are oxyanion forms of the substrate, by binding them more tightly
to the enzyme than the substrate was bound. The chemical modes of catalysis
(acid-base and covalent catalysis) and the binding modes of catalysis (the proximity
effect and transition-state stabilization) all contribute to the enzymatic activity of serine
proteases.



The protease subtilisin from the bacterium Bacillus subtilis is 
another example of a serine protease. It possesses a catalytic 
triad consisting of Asp-32, His-64, and Ser-221 at its active 
site. These are arranged in an alignment similar to the Asp-102,
His-57, and Ser-195 residues in chymotrypsin (Figure 6.27). 
However, as you might deduce from the residue numbers, 
the structures of subtilisin and chymotrypsin are very differ- 
ent and there is no significant sequence similarity. 

This is a remarkable example of convergent evolution. 
The mammalian intestinal serine proteases and the bacterial 
subtilisins have independently discovered the catalytic Asp- 
His-Ser triad. 

► Subtilisin from Bacillus 
subtilis. The structure 
of this enzyme is very 
different from that of 
serine proteases shown in 
Figure 6.24. [PDB 1SBC] 

6.7 Lysozyme 

Lysozyme catalyzes the hydrolysis of some polysaccharides, especially those that make 
up the cell walls of bacteria. It is the first enzyme whose structure was solved and for 
this reason there has been a long-term interest in working out its precise mechanism of 
action. Many secretions, such as tears, saliva, and nasal mucus, contain lysozyme
activity to help prevent bacterial infection. (Lysozyme causes lysis, or disruption, of bacterial
cells.) The best-studied lysozyme is from chicken egg white. 

The substrate of lysozyme is a polysaccharide composed of alternating residues
of N-acetylglucosamine (GlcNAc) and N-acetylmuramic acid (MurNAc) connected
by glycosidic bonds (Figure 6.29). Lysozyme specifically catalyzes hydrolysis of the
glycosidic bond between C-1 of a MurNAc residue and the oxygen atom at C-4 of a
GlcNAc residue.

The structure of bacterial cell walls is described in Section 8.7B.

Models of lysozyme and its complexes with saccharides have been obtained by 
X-ray crystallographic analysis (Figure 6.30). The substrate-binding cleft of lysozyme 
accommodates six saccharide residues. Each of the residues binds to a particular part of 
the active cleft at sites A through F.

Sugar molecules fit easily into all but one site of the structural model. At site D a 
sugar molecule such as MurNAc does not fit into the model unless it is distorted into a 


◄ Figure 6.29 

Structure of a four-residue portion of a bacterial
cell-wall polysaccharide. Lysozyme catalyzes
hydrolytic cleavage of the glycosidic
bond between C-1 of MurNAc and the oxygen
atom involved in the glycosidic bond.



The noncovalent enzyme-substrate 
complex is formed, orienting the 
substrate for reaction. Interactions 
holding the substrate in place 
include binding of the R1 group in
the specificity pocket (shaded). The 
binding interactions position the 
carbonyl carbon of the scissile 
peptide bond (the bond 
susceptible to cleavage) next to 
the oxygen of Ser-195. 

Binding of the substrate compresses 
Asp-102 and His-57. This strain is 
relieved by formation of a low-barrier 
hydrogen bond. The raised pK a of 
His-57 enables the imidazole ring to 
remove a proton from the hydroxyl 
group of Ser-195. The nucleophilic 
oxygen of Ser-195 attacks the 
carbonyl carbon of the peptide bond 
to form a tetrahedral intermediate 
(E-TI1), which is believed to resemble
the transition state. 

When the tetrahedral intermediate is 
formed, the substrate C — O bond 
changes from a double bond to a longer 
single bond. This allows the negatively 
charged oxygen (the oxyanion) of the 
tetrahedral intermediate to move to a
previously vacant position, called the 
oxyanion hole, where it can form 
hydrogen bonds with the peptide-chain 
— NH groups of Gly-193 and Ser-195. 

The imidazolium ring of His-57 acts as 
an acid catalyst, donating a proton to 
the nitrogen of the scissile peptide 
bond, thus facilitating its cleavage. 

The carbonyl group from the peptide
forms a covalent bond with the
enzyme, producing an acyl-enzyme
intermediate. After the peptide
product (P1) with the new amino
terminus leaves the active site, water
enters.





▲ Figure 6.28 

Mechanism of chymotrypsin-catalyzed cleavage of a peptide bond. 




Hydrolysis (deacylation) of the acyl-enzyme
intermediate starts when
Asp-102 and His-57 again form a low-barrier
hydrogen bond and His-57
removes a proton from the water
molecule to provide an OH⁻ group to
attack the carbonyl group of the ester.

A second tetrahedral intermediate (E-TI2)
is formed and stabilized by the oxyanion
hole.

His-57, once again an imidazolium ion,
donates a proton, leading to the collapse
of the second tetrahedral intermediate.

The second product (P2), a polypeptide
with a new carboxy terminus, is formed.

The carboxylate product (P2) is released from
the active site, and free chymotrypsin is
regenerated.

▲ Figure 6.28 ( continued ) 


▲ Figure 6.30 

Lysozyme from chicken with a pentasaccharide 
molecule (pink). The ligand is bound in sites 
A, B, C, D and E. Site F is not occupied 
in this structure. The active site for bond 
cleavage is between sites D and E. 

[PDB 1SFB]. 


▲ Figure 6.31 

Conformations of N-acetylmuramic acid.
(a) Chair conformation. (b) Half-chair conformation
proposed for the sugar bound in
site D of lysozyme. R represents the lactyl
group of MurNAc.

half-chair conformation (Figure 6.31). Two ionic amino acid residues, Glu-35 and
Asp-52, are located close to C-1 of the distorted sugar molecule in the D binding site.
Glu-35 is in a nonpolar region of the cleft and has a perturbed pKa near 6.5. Asp-52, in
a more polar environment, has a pKa near 3.5. The pH optimum of lysozyme is near
5, between these two pKa values. Recall that the pKa value of individual amino acid
side chains may not be the same as the pKa value of the free amino acid in solution
(Section 3.4).
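
The link between the two pKa values and the pH optimum can be illustrated with the Henderson-Hasselbalch relationship. The Python sketch below is a simplified model, not a fit to experimental data. It assumes that activity requires Glu-35 to be protonated and Asp-52 to be deprotonated, that the two groups ionize independently, and that nothing else depends on pH. The function names are hypothetical.

```python
PKA_GLU35 = 6.5  # must be protonated to act as an acid catalyst
PKA_ASP52 = 3.5  # must be deprotonated (negatively charged)

def fraction_protonated(ph, pka):
    """Henderson-Hasselbalch: fraction of a group in its protonated form."""
    return 1.0 / (1.0 + 10.0 ** (ph - pka))

def fraction_active(ph):
    """Fraction of enzyme with Glu-35 protonated and Asp-52 deprotonated."""
    glu_protonated = fraction_protonated(ph, PKA_GLU35)
    asp_deprotonated = 1.0 - fraction_protonated(ph, PKA_ASP52)
    return glu_protonated * asp_deprotonated

# Scan pH 0.0 to 14.0 in steps of 0.1 and report the best value.
ph_values = [i / 10 for i in range(141)]
best_ph = max(ph_values, key=fraction_active)

print(f"model pH optimum: {best_ph}")
print(f"active fraction at optimum: {fraction_active(best_ph):.2f}")
```

In this model the optimum falls at the midpoint of the two pKa values, pH 5, which agrees with the observed optimum quoted above. The agreement supports the assignment of roles to the two residues but does not prove it, since the model ignores every other ionizable group.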

The proposed mechanism of lysozyme is shown in Figure 6.32. When a molecule of
polysaccharide binds to lysozyme, MurNAc residues bind to sites B, D, and F (there is
no cavity for the lactyl side chain of MurNAc in site A, C, or E). The extensive binding of
the oligosaccharide forces the MurNAc residue in the D site into the half-chair
conformation. A near-covalent bond forms between Asp-52 and the postulated intermediate
(an unstable oxocarbocation). Recent evidence suggests that this interaction might be
more like a covalent bond than a strong ion pair, but there is much controversy over this
point. It is interesting that there are still details of the lysozyme mechanism to be worked
out after almost 50 years of effort.

Lysozyme is only one representative of a large group of glycoside hydrolases. Re- 
cently, the structures of a bacterial cellulase and its complexes with substrate, intermediate, 
and product have been determined. This glycosidase has a slightly different mecha- 
nism than lysozyme — it forms a covalent glycosyl-enzyme intermediate rather than 
the strong ion pair postulated for lysozyme. Other aspects of its mechanism, such 
as distortion of a sugar residue and interaction with active-site -COOH and
-COO⁻ side chains, resemble those of the lysozyme mechanism. The structures
of the enzyme complexes show that distortion of the substrate forces it toward the 
transition state. 

6.8 Arginine Kinase 

Most enzymatic reactions for which detailed mechanisms have been elucidated involve 
fairly simple reactions, such as isomerizations, cleavage reactions, or reactions with 
water as the second reactant. Therefore, in order to assess proximity effects and the ex- 
tent of transition state stabilization, it’s worthwhile looking at a more complicated reac- 
tion, such as that catalyzed by arginine kinase: 

Arginine + MgATP ⇌ Arginine phosphate + MgADP + H⁺

The structure of a transition-state analog-enzyme complex of arginine kinase has
been determined at high resolution (Figure 6.33). However, rather than studying the 
usual type of transition-state analog in which reactants are fused by covalent bonds, the 
scientists used three separate components: arginine, nitrate (to model the phosphoryl 
group transferred between arginine and ADP), and ADP. X-ray crystallographic exami- 
nation of the active site containing these three compounds led to the proposal of a 
structure for the transition state and a mechanism for the reaction (see Figure 6.33). 
The crystallographic results showed that the enzyme has greatly restricted the move- 
ment of the bound species (and presumably also of the transition state). For example, 
the terminal pyrophosphoryl group of ATP is held in place by four arginine side chains 
and a bound Mg²⁺ ion and the guanidinium group of the arginine substrate molecule is
held firmly by two glutamate side chains. The components are precisely and properly 
aligned by the enzyme. 

Arginine kinase, like other kinases, is an induced-fit enzyme (Section 6.5C). It
assumes the closed shape when it is crystallized in the presence of arginine, nitrate, and
ADP. This enzyme has a kcat of about 2 × 10² s⁻¹ and Km values above 10⁻⁴ M for both
arginine and ATP, values that are quite typical for kinases. The movement that occurs
during the induced-fit binding of substrates has precisely aligned the substrates, which
had previously been bound fairly weakly, as shown by their moderate Km values. At least
four interrelated catalytic effects participate in this enzymatic reaction: proximity


A MurNAc residue of the 
substrate is distorted when 
it binds to the D site. 

Glu-35, which is protonated at pH 5, 
acts as an acid catalyst, donating a proton 
to the oxygen involved in the glycosidic 
bond between the D and E residues.

The portion of the substrate bound 

Asp-52, which is negatively 
charged at pH 5, forms a strong 
ion pair with the unstable 
oxocarbocation intermediate. 
This interaction is close to a 
covalent bond. 

A proton from the water molecule is 

▲ Figure 6.32 

Mechanism of lysozyme. R1 represents the lactyl group, and R2 represents the N-acetyl group of MurNAc.


Figure 6.33 ► 

Proposed structure of the active site of arginine 
kinase in the presence of ATP and arginine. 

The substrate molecules are held firmly and 
aligned toward the transition state, as shown 
by the dashed lines. The asterisks (*) show 
that either Glu-225 or Glu-314 could act as 
a general acid-base catalyst. 

(Adapted from Zhou, G., Somasundaram, T., Blanc,
E., Parthasarathy, G., Ellington, W. R., and Chapman,
M. S. (1998). Transition state structure of arginine
kinase: implications for catalysis of bimolecular
reactions. Proc. Natl. Acad. Sci. USA. 95:8453.)


(collection and alignment of substrate molecules), fairly weak initial binding of sub- 
strates, acid-base catalysis, and transition-state stabilization (strain of substrates toward 
the shape of the transition state). 
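
The kinetic constants quoted above for arginine kinase can be combined in a short calculation. The Python sketch below is an approximation only. It treats the enzyme with the single-substrate Michaelis-Menten equation, which assumes that the second substrate is saturating, and it uses 10⁻⁴ M as the Km even though the text gives this value only as a lower bound. The function and variable names are hypothetical.

```python
def turnover_rate(substrate_conc, kcat, km):
    """Michaelis-Menten rate per enzyme molecule (s^-1) at a substrate concentration (M)."""
    return kcat * substrate_conc / (km + substrate_conc)

KCAT = 2e2  # s^-1, "about 2 x 10^2" in the text
KM = 1e-4   # M, lower bound: the text says Km values are above this

# Because Km is a lower bound, kcat/Km is an upper bound on the specificity constant.
specificity_upper_bound = KCAT / KM

rate_at_km = turnover_rate(KM, KCAT, KM)          # half of kcat by definition
rate_at_10_km = turnover_rate(10 * KM, KCAT, KM)  # approaching saturation

print(f"kcat/Km is at most about {specificity_upper_bound:.0e} M^-1 s^-1")
print(f"rate at [S] = Km:    {rate_at_km:.0f} s^-1")
print(f"rate at [S] = 10 Km: {rate_at_10_km:.0f} s^-1")
```

The upper bound of about 2 × 10⁶ M⁻¹ s⁻¹ for kcat/Km is consistent with the description of the substrates as fairly weakly bound: the enzyme gains its speed from alignment in the closed form and not from tight initial binding.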

Having gained insight into the general mechanisms of enzymes, we can now go on 
to examine reactions that include coenzymes. These reactions require groups not sup- 
plied by the side chains of amino acids. 

Summary


1. The four major modes of enzymatic catalysis are acid-base catalysis 
and covalent catalysis (chemical modes) and proximity and tran- 
sition-state stabilization (binding modes). The atomic details of 
reactions are described by reaction mechanisms, which are based 
on the analysis of kinetic experiments and protein structures. 

2. For each step in a reaction, the reactants pass through a transition 
state. The energy difference between stable reactants and the tran- 
sition state is the activation energy. Catalysts allow faster reactions 
by lowering the activation energy. 

3. Ionizable amino acid residues in active sites form catalytic cen- 
ters. These residues may participate in acid-base catalysis (proton 
addition or removal) or covalent catalysis (covalent attachment of 
a portion of the substrate to the enzyme). The effects of pH on 
the rate of an enzymatic reaction can suggest which residues par- 
ticipate in catalysis. 

4. The catalytic rates for a few enzymes are so high that they ap- 
proach the upper physical limit of reactions in solution, the rate 
at which reactants approach each other by diffusion. 

5. Most of the rate acceleration achieved by an enzyme arises from 
the binding of substrates to the enzyme. 

6. The proximity effect is acceleration of the reaction rate due to the 
formation of a noncovalent ES complex that collects and orients 
reactants resulting in a decrease in entropy. 

7. An enzyme binds its substrates fairly weakly. Excessively strong 
binding would stabilize the ES complex and slow the reaction. 

8. An enzyme binds a transition state with greater affinity than it 
binds substrates. Evidence for transition state stabilization is pro- 
vided by transition-state analogs that are enzyme inhibitors. 

9. Some enzymes use induced fit (substrate-induced activation that 
involves a conformation change) to prevent wasteful hydrolysis of 
a reactive substrate. 

10. Many serine proteases are synthesized as inactive zymogens that 
are activated extracellularly under appropriate conditions by se- 
lective proteolysis. The examination of serine proteases by X-ray 
crystallography shows how the three-dimensional structures of 
proteins can reveal information about the active sites, including 
the binding of specific substrates. 

11. The active sites of serine proteases contain a hydrogen-bonded 
Ser-His-Asp catalytic triad. The serine residue serves as a cova- 
lent catalyst, and the histidine residue serves as an acid-base cata- 
lyst. Anionic tetrahedral intermediates are stabilized by hydrogen 
bonds with the enzyme. 

12. The proposed mechanism for lysozyme, an enzyme that catalyzes 
the hydrolysis of bacterial cell walls, includes substrate distortion 
and stabilization of an unstable oxocarbocation intermediate. 

Problems


1. (a) What forces are involved in binding substrates and intermediates
to the active sites of enzymes?

(b) Explain why very tight binding of a substrate to an enzyme is
not desirable for enzyme catalysis, whereas tight binding of
the transition state is desirable.

2. The enzyme orotidine 5′-phosphate decarboxylase is one of the
most proficient enzymes known, accelerating the rate of decarboxylation
of orotidine 5′-monophosphate by a factor of 10²³ (Section
5.4). Nitrogen-15 isotope effect studies have shown that two major
participating mechanisms are (1) destabilization of the ground state
ES complex by electrostatic repulsion between the enzyme and substrate,
and (2) stabilization of the transition state by favorable electrostatic
interactions between the enzyme and ES*. Draw an energy
diagram that shows how these two effects promote catalysis.

3. The energy diagrams for two multistep reactions are shown below. 
What is the rate-determining step in each of these reactions?

4. Reaction 2 below occurs 2.5 × 10¹¹ times faster than Reaction 1.
What is likely to be a major reason for this enormous rate increase 
in Reaction 2? How is this model relevant for interpreting possi- 
ble mechanisms for enzyme rate increases? 


5. List three major catalytic effects for lysozyme and explain how 
each is used during the enzyme-catalyzed hydrolysis of a glycosidic
bond.

6. There are multiple serine residues in α-chymotrypsin but only
serine 195 reacts rapidly when the enzyme is treated with active
phosphate inhibitors such as diisopropyl fluorophosphate (DFP). Explain.

7. (a) Identify the residues in the catalytic triad of α-chymotrypsin
and indicate the type of catalysis mediated by each residue.

(b) What additional amino acid groups are found in the oxyanion
hole and what role do they play in catalysis?

(c) Explain why site-directed mutagenesis of aspartate to
asparagine in the active site of trypsin decreases the catalytic
activity 10,000-fold.

8. Catalytic triad groupings of amino acid residues increase the nucleophilic
character of active-site serine, threonine, or cysteine
residues present in many enzymes involved in catalyzing the cleavage
of substrate amide or ester bonds. Using α-chymotrypsin as a
model system, diagram the expected arrangements of the catalytic
triads in the enzymes below.

(a) Human cytomegalovirus protease: His, His, Ser 

(b) β-Lactamase: Glu, Lys, Ser

(c) Asparaginase: Asp, Lys, Thr 

(d) Hepatitis A protease: Asp, (H₂O), His, Cys (a water molecule
is situated between the Asp and His residues) 

9. Human dipeptidyl peptidase IV (DPP-IV) is a serine protease
that catalyzes hydrolysis of prolyl peptide bonds at the next-to-last
position at the N terminus of a protein. Many physiological
peptides have been identified as substrates, including proteins
involved in the regulation of glucose metabolism. DPP-IV contains
a catalytic triad at the active site (Glu-His-Ser) and a tyrosine
residue in the oxyanion hole. Site-directed mutagenesis of this
tyrosine residue in DPP-IV was performed, and the ability of
the enzyme to cleave a peptide substrate was compared to that of the
wild-type enzyme. The tyrosine residue found in the oxyanion
hole was changed to a phenylalanine. The phenylalanine mutant
had less than 1% of the activity of the wild-type enzyme (Bjelke,
J. R., Christensen, J., Branner, S., Wagtmann, N., Olsen, C.,
Kanstrup, A. B., and Rasmussen, H. B. (2004). Tyrosine 547 constitutes
an essential part of the catalytic mechanism of dipeptidyl
peptidase IV. J. Biol. Chem. 279:34691-34697). Is this tyrosine
required for activity of DPP-IV? Why does the replacement of a
tyrosine with a phenylalanine abolish the enzyme activity?

10. Acetylcholinesterase (AChE) catalyzes the breakdown of the neurotransmitter
acetylcholine to acetate and choline. This enzyme
contains a catalytic triad with the residues His, Glu, and Ser. The 
catalytic triad enhances the nucleophilicity of the serine residue. 
The nucleophilic oxygen of serine attacks the carbonyl carbon of 
acetylcholine to form a tetrahedral intermediate. 



[Reaction scheme: acetylcholine + H₂O, catalyzed by AChE, yields acetate + choline]

The nerve agent sarin is an extremely potent inactivator of AChE. 
Sarin is an irreversible inhibitor that covalently modifies the ser- 
ine residue in the active site of AChE. 



[Structure of sarin]




(a) Diagram the expected arrangement of the amino acids in the 
catalytic triad. 

(b) Propose a mechanism for the covalent modification of AChE 
by sarin. 


11. Catalytic antibodies are potential therapeutic agents for drug
overdose and addiction. For example, a catalytic antibody that
catalyzes the breakdown of cocaine before it reaches the brain
would be an effective detoxification treatment for drug abuse and
addiction. The phosphonate analog below was used to raise an
anticocaine antibody that catalyzes the rapid hydrolysis of cocaine.
Explain why this phosphonate ester was chosen to produce
a catalytic antibody.

[Figure: structures of the phosphonate analog and (−)-cocaine; hydrolysis of cocaine yields ecgonine methyl ester and benzoic acid]

12. In the chronic lung disease emphysema, the lung’s air sacs (alveoli),
where oxygen from the air is exchanged for carbon dioxide in
the blood, degenerate. α₁-Proteinase inhibitor deficiency is a
genetic condition that runs in certain families and results from
mutations in critical amino acids in the sequence of α₁-proteinase
inhibitor. Individuals with these mutations are more likely to develop
emphysema. α₁-Proteinase inhibitor is produced by the
liver and then circulates in the blood. α₁-Proteinase inhibitor is a
protein that serves as the major inhibitor of neutrophil elastase,
a serine protease present in the lung. Neutrophil elastase cleaves
the protein elastin, which is an important component for lung
function. The increased rate of elastin breakdown in lung tissue is
believed to cause emphysema. One treatment for α₁-proteinase
inhibitor deficiency is to give the patient human wild-type
α₁-proteinase inhibitor (derived from large pools of human
plasma) intravenously by injecting the protein directly into the
bloodstream.

(a) Explain the rationale for the treatment with wild-type
α₁-proteinase inhibitor.

(b) This treatment involves the intravenous administration
of the wild-type α₁-proteinase inhibitor. Explain why
α₁-proteinase inhibitor cannot be taken orally.











Coenzymes and Vitamins 

Evolution has produced a spectacular array of protein catalysts but the catalytic
repertoire of an organism is not limited by the reactivity of amino acid side 
chains. Other chemical species, called cofactors, often participate in catalysis. 
Cofactors are required by inactive apoenzymes (proteins only) to convert them to active 
holoenzymes. There are two types of cofactors: essential ions (mostly metal ions) and 
organic compounds known as coenzymes (Figure 7.1). Both inorganic and organic co- 
factors become essential portions of the active sites of certain enzymes. 

Many of the minerals required by all organisms are essential because they are cofactors.
Some essential ions, called activator ions, are reversibly bound and often participate
in the binding of substrates. In contrast, some cations are tightly bound and frequently 
participate directly in catalytic reactions. 

Coenzymes act as group-transfer reagents. They accept and donate specific chemical
groups. For some coenzymes, the group is simply hydrogen or an electron but other
coenzymes carry larger, covalently attached chemical groups. These mobile metabolic 
groups are attached at the reactive center of the coenzyme. (Either the mobile metabolic 
group or the reactive center is shown in red in the structures presented in this chapter.) 
We can simplify our study of coenzymes by focusing on the chemical properties of their 
reactive centers. The two classes of coenzymes are described in Section 7.2. 

We begin this chapter with a discussion of essential ion cofactors. Much of the rest
of the chapter is devoted to the more complex organic cofactors. In mammals, many of 
these coenzymes are derived from dietary precursors called vitamins. We therefore dis- 
cuss vitamins in this chapter. We conclude with a look at a few proteins that are coen- 
zymes. Most of the structures and reactions presented here will be encountered in later 
chapters when we discuss particular metabolic pathways. 


Cofactors
    Essential ions
        Activator ions (loosely bound)
        Metal ions of metalloenzymes (tightly bound)
    Coenzymes
        Cosubstrates (loosely bound)
        Prosthetic groups (tightly bound)

Finally, we come to a group of compounds which have only been known
for a relatively short time, but which during this short time have attracted
very considerable attention, both from chemists and from the public at
large. Who today is unacquainted with vitamins, these mysterious substances
which are of such immense significance for life, vita, itself and
which have thus justifiably taken their name from it?

— H. G. Söderbaum, presentation speech for the Nobel Prize in
Chemistry to Adolf Windaus, 1928

◄ Figure 7.1 

Types of cofactors. Essential ions and coen- 
zymes can be further distinguished by the 
strength of interaction with their apoenzymes. 

Top: Nicotinamide adenine dinucleotide (NAD⁺), a coenzyme derived from the vitamin nicotinic acid (niacin). NAD⁺ is
an oxidizing agent.



7.1 Many Enzymes Require Inorganic Cations 

Over a quarter of all known enzymes require metallic cations to achieve full catalytic activity.
These enzymes can be divided into two groups: metal-activated enzymes and
metalloenzymes. Metal-activated enzymes either have an absolute requirement for added
metal ions or are stimulated by the addition of metal ions. Some of these enzymes require
monovalent cations such as K⁺ and others require divalent cations such as Ca²⁺ or
Mg²⁺. Kinases, for example, require magnesium ions for the magnesium-ATP complex
they use as a phosphoryl-group-donating substrate. Magnesium shields the negatively
charged phosphate groups of ATP, making them more susceptible to nucleophilic attack
(Section 10.6).

Metalloenzymes contain firmly bound metal ions at their active sites. The ions
most commonly found in metalloenzymes are the transition metals, iron and zinc, and
less often, copper and cobalt. Metal ions that bind tightly to enzymes are usually required
for catalysis. The cations of some metalloenzymes can act as electrophilic catalysts
by polarizing bonds. For example, the cofactor for the enzyme carbonic anhydrase
is an electrophilic zinc atom bound to the side chains of three histidine residues and to
a molecule of water. Binding to Zn²⁺ causes the water to ionize more readily. A basic
carboxylate group of the enzyme removes a proton from the bound water molecule,
producing a nucleophilic hydroxide ion that attacks the substrate (Figure 7.2). This enzyme
has a very high catalytic rate partly because of the simplicity of its mechanism
(Section 6.4). Many other zinc metalloenzymes activate bound water molecules in this
way.
The ions of other metalloenzymes can undergo reversible oxidation and reduction
by transferring electrons from a reduced substrate to an oxidized substrate. For example,
iron is part of the heme group of catalase, an enzyme that catalyzes the degradation of
H₂O₂. Similar heme groups also occur in cytochromes, electron-transferring proteins
found associated with specific metalloenzymes in mitochondria and chloroplasts. Nonheme
iron is often found in metalloenzymes in the form of iron-sulfur clusters (Figure 7.3).
The most common iron-sulfur clusters are the [2Fe-2S] and [4Fe-4S] clusters in
which the iron atoms are complexed with an equal number of sulfide ions from H₂S
and —S⁻ groups from cysteine residues. Iron-sulfur clusters mediate some oxidation-reduction
reactions. Each cluster, whether it contains two or four iron atoms, can accept
only one electron in an oxidation reaction.

7.2 Coenzyme Classification 

Coenzymes can be classified into two types based on how they interact with the apoen- 
zyme (Figure 7.1). Coenzymes of one type — often called cosubstrates — are actually sub- 
strates in enzyme-catalyzed reactions. A cosubstrate is altered in the course of the reac- 
tion and dissociates from the active site. The original structure of the cosubstrate is 
regenerated in a subsequent reaction catalyzed by another enzyme. The cosubstrate is 
recycled repeatedly within the cell, unlike an ordinary substrate whose product typically 
undergoes further transformation. Cosubstrates shuttle mobile metabolic groups 
among different enzyme-catalyzed reactions.

The second type of coenzyme is called a prosthetic group. A prosthetic group re- 
mains bound to the enzyme during the course of the reaction. In some cases the pros- 
thetic group is covalently attached to its apoenzyme, while in other cases it is tightly 
bound to the active site by many weak interactions. Like the ionic amino acid residues 
of the active site, a prosthetic group must return to its original form during each full 
catalytic event or the holoenzyme will not remain catalytically active. Cosubstrates and 
prosthetic groups are part of the active site of enzymes. They supply reactive groups 
that are not available on the side chains of amino acid residues. 

Every living species uses coenzymes in a diverse number of important enzyme- 
catalyzed reactions. Most of these species are capable of synthesizing their coenzymes 
from simple precursors. This is especially true in four of the five kingdoms — prokary- 
otes, protists, fungi, and plants — but animals have lost the ability to synthesize some 

Refer to Figure 1.1 for a table of the 
essential elements. 



▲ Figure 7.2 

Mechanism of carbonic anhydrase. The zinc 
ion in the active site promotes the ionization 
of a bound water molecule. The resulting 
hydroxide ion attacks the carbon atom of 
carbon dioxide, producing bicarbonate, 
which is released from the enzyme. 

Review Section 4.12 for the structure 
of heme. 

Cytochromes will be discussed in 
Section 7.16. 


▲ Figure 7.3 

Iron-sulfur clusters. In each type of iron- 
sulfur cluster, the iron atoms are complexed 
with an equal number of sulfide ions (S²⁻)
and with the thiolate groups of the side 
chains of cysteine residues. 

Table 7.1 Some vitamins and their associated deficiency diseases

Vitamin                 Disease
Ascorbate (C)           Scurvy
Thiamine (B₁)           Beriberi
Riboflavin (B₂)         Growth retardation
Nicotinic acid (B₃)     Pellagra
Pantothenate (B₅)       Dermatitis in chickens
Pyridoxal (B₆)          Dermatitis in rats
Biotin (B₇)             Dermatitis in humans
Folate (B₉)             Anemia
Cobalamin (B₁₂)         Pernicious anemia

The structure and chemistry of 
nucleotides is discussed in more detail 
in Chapter 19. 

coenzymes. Mammals (including humans) need a source of coenzymes in order to survive.
The ones they cannot synthesize are supplied by nutrients, usually in small amounts
(micrograms or milligrams per day). These essential compounds are called vitamins and 
animals rely on other organisms to supply these micronutrients. The ultimate sources 
of vitamins are usually plants and microorganisms. Most vitamins are coenzyme 
precursors — they must be enzymatically transformed to their corresponding coenzymes. 

A vitamin-deficiency disease can result when a vitamin is deficient or absent in the 
diet of an animal. Such diseases can be overcome or prevented by consuming the appropriate
vitamin. Table 7.1 lists nine vitamins and the diseases associated with their deficiencies.
Each of these vitamins and their metabolic roles are discussed below. Most of
them are converted to coenzymes, sometimes after a reaction with ATP. 

The word vitamin (originally spelled “vitamine”) was coined by Casimir Funk in 
1912 to describe a “vital amine” from brown rice that cured beriberi, a nutritional-defi- 
ciency disease that results in neural degeneration. The term vitamin has been retained 
even though many vitamins proved not to be amines. Beriberi was first described in 
birds and then in humans whose diets consisted largely of polished rice. Christiaan Eijkman,
a Dutch physician working in what was then the Dutch East Indies (now Indonesia),
was the first to notice that chickens fed polished rice left over from the local hospital
developed beriberi but they recovered when they were fed brown rice. This discovery 
led eventually to isolation of an antiberiberi substance from the skin that covers brown 
rice. This substance became known as vitamin B₁ (thiamine).

Two broad classes of vitamins have since been identified: water-soluble (such as B 
vitamins) and fat-soluble (also called lipid vitamins). Water-soluble vitamins are 
required daily in small amounts because they are readily excreted in the urine and the 
cellular stores of their coenzymes are not stable. Conversely, lipid vitamins such as vita- 
mins A, D, E, and K, are stored by animals and excessive intakes can result in toxic con- 
ditions known as hypervitaminoses. It’s important to note that not all vitamins are 
coenzymes or their precursors (see Box 7.4 and Section 7.14). 

The most common coenzymes are listed in Table 7.2 along with their metabolic 
role and their vitamin source. The following sections describe the structures and func- 
tions of these common coenzymes. 

7.3 ATP and Other Nucleotide Cosubstrates 

A number of nucleosides and nucleotides are coenzymes. Adenosine triphosphate (ATP) is 
by far the most abundant. Other common examples are GTP, S-adenosylmethionine, and 
nucleotide sugars such as uridine diphosphate glucose (UDP-glucose). ATP (Figure 7.4) 
is a versatile reactant that can donate its phosphoryl, pyrophosphoryl, adenylyl (AMP), 
or adenosyl groups in group-transfer reactions.

The most common reaction involving ATP is phosphoryl group transfer. In reactions
catalyzed by kinases, for example, the γ-phosphoryl group of ATP is transferred to
a nucleophile leaving ADP. The second most common reaction is nucleotidyl group
transfer (transfer of the AMP moiety) leaving pyrophosphate (PPᵢ). ATP plays a central
role in metabolism. Its role as a “high energy” cofactor is described in more detail in 
Chapter 10, “Introduction to Metabolism.” 

ATP is also the source of several other metabolite coenzymes. One, S-adenosylmethionine
(Figure 7.5), is synthesized by the reaction of methionine with ATP.

Methionine + ATP → S-Adenosylmethionine + Pᵢ + PPᵢ (7.1)

The normal thiomethyl group of methionine (—S—CH₃) is not very reactive but the positively
charged sulfonium of S-adenosylmethionine is highly reactive. S-adenosylmethionine

◄ Brown rice and white rice. Brown rice (top left) has been processed to remove the outer husks but it 
retains part of the outer skin or “bran.” This skin contains thiamine (vitamin B₁). Further processing
of the grain yields white rice (middle left), which lacks thiamine. 


Table 7.2 Major coenzymes

Coenzyme | Vitamin source | Major metabolic roles | Mechanistic role
Adenosine triphosphate (ATP) | — | Transfer of phosphoryl or nucleotidyl groups | Cosubstrate
S-Adenosylmethionine | — | Transfer of methyl groups | Cosubstrate
Uridine diphosphate glucose | — | Transfer of glycosyl groups | Cosubstrate
Nicotinamide adenine dinucleotide (NAD⁺) and nicotinamide adenine dinucleotide phosphate (NADP⁺) | Niacin (B₃) | Oxidation-reduction reactions involving two-electron transfer | Cosubstrate
Flavin mononucleotide (FMN) and flavin adenine dinucleotide (FAD) | Riboflavin (B₂) | Oxidation-reduction reactions involving one- and two-electron transfers | Prosthetic group
Coenzyme A (CoA) | Pantothenate (B₅) | Transfer of acyl groups | Cosubstrate
Thiamine pyrophosphate (TPP) | Thiamine (B₁) | Transfer of multi-carbon fragments containing a carbonyl group | Prosthetic group
Pyridoxal phosphate (PLP) | Pyridoxine (B₆) | Transfer of groups to and from amino acids | Prosthetic group
Biotin | Biotin (B₇) | ATP-dependent carboxylation of substrates or carboxyl-group transfer between substrates | Prosthetic group
Tetrahydrofolate | Folate (B₉) | Transfer of one-carbon substituents, especially formyl and hydroxymethyl groups; provides the methyl group for thymine in DNA | Cosubstrate
Adenosylcobalamin and methylcobalamin | Cobalamin (B₁₂) | Intramolecular rearrangements, transfer of methyl groups | Prosthetic group
Lipoamide | — | Oxidation of a hydroxyalkyl group from TPP and subsequent transfer as an acyl group | Prosthetic group
Retinal | Vitamin A | Vision | Prosthetic group
Vitamin K | Vitamin K | Carboxylation of some glutamate residues | Prosthetic group
Ubiquinone (Q) | — | Lipid-soluble electron carrier | Cosubstrate
Heme group | — | Electron transfer | Prosthetic group

reacts readily with nucleophilic acceptors and is the donor of almost all the methyl
groups used in biosynthetic reactions. For example, it is required for conversion of the
hormone norepinephrine to epinephrine.

The thermodynamics of reactions involving ATP is explained in Section 10.6.



Norepinephrine + S-Adenosylmethionine → Epinephrine + S-Adenosylhomocysteine (7.2)





▲ Figure 7.4 

ATP. The nitrogenous base adenine is linked to a ribose bearing three phosphoryl groups. Transfer of 
a phosphoryl group (red) generates ADP, and transfer of a nucleotidyl group (AMP, blue) generates
pyrophosphate (PPᵢ).

▲ Figure 7.5 

S-Adenosylmethionine. The activated methyl 
group of this coenzyme is shown in red. 



Whatever happened to vitamin B₄ and vitamin B₈? They are
never listed in the textbooks but you’ll often find them sold
in stores that cater to the demand for supplements that might
make you feel better and live longer.

Vitamin B₄ was adenine, the base found in DNA and
RNA. We now know that it’s not a vitamin. All species, including
humans, can make copious quantities of adenine
whenever it’s needed (Sections 18.1 and 18.2). Vitamin B₈
was inositol, a precursor of several important lipids
(Figure 8.16 and Section 9.12C). It’s no longer considered a
vitamin.

If you know anyone who is paying money for vitamin B₄
and B₈ supplements then here’s your chance to be helpful.
Tell them why they’re wasting their money.

▲ P.T. Barnum. P.T. Barnum was a famous American showman. 
He’s credited with saying, “There’s a sucker born every minute.” 
It’s likely that the memorable phrase was coined by one of his 
rivals and later attributed to Barnum in order to discredit him. 

Methylation reactions that require S-adenosylmethionine include methylation of phos- 
pholipids, proteins, DNA, and RNA. In plants, S-adenosylmethionine — as a precursor 
of the plant hormone ethylene — is involved in regulating the ripening of fruit. 

Nucleotide-sugar coenzymes are involved in carbohydrate metabolism. The most
common nucleotide sugar, uridine diphosphate glucose (UDP-glucose), is formed by
the reaction of glucose 1-phosphate with uridine triphosphate (UTP) (Figure 7.6).
UDP-glucose can donate its glycosyl group (shown in red) to a suitable acceptor, releasing
UDP. UDP-glucose is regenerated when UDP accepts a phosphoryl group from ATP
and the resulting UTP reacts with another molecule of glucose 1-phosphate.

Both the sugar and the nucleoside of nucleotide-sugar coenzymes may vary. Later 
on, we will encounter CDP, GDP, and ADP variants of this coenzyme. 

7.4 NAD⁺ and NADP⁺

The nicotinamide coenzymes are nicotinamide adenine dinucleotide (NAD⁺) and the
closely related nicotinamide adenine dinucleotide phosphate (NADP⁺). These were the
first coenzymes to be recognized. Both contain nicotinamide, the amide of nicotinic
acid (Figure 7.7). Nicotinic acid (also called niacin) is the factor missing in the disease
pellagra. Nicotinic acid or nicotinamide is essential as a precursor of NAD⁺ and
NADP⁺. (In many species, tryptophan is degraded to nicotinic acid. Dietary tryptophan
can therefore spare some of the requirement for niacin or nicotinamide.)

The nicotinamide coenzymes play a role in many oxidation-reduction reactions.
They assist in the transfer of electrons to and from metabolites (Section 10.9). The oxidized
forms, NAD⁺ and NADP⁺, are electron deficient and the reduced forms, NADH
and NADPH, carry an extra pair of electrons in the form of a covalently bound hydride
ion. The structures of these coenzymes are shown in Figure 7.8. Both coenzymes contain
a phosphoanhydride linkage that joins two 5′-nucleotides: AMP and the ribonucleotide
of nicotinamide, called nicotinamide mononucleotide (NMN) (formed from
nicotinic acid). In the case of NADP⁺, a phosphoryl group is present on the 2′-oxygen
atom of the adenylate moiety.

Note that the + sign in NAD⁺ simply indicates that the nitrogen atom carries a
positive charge. This does not mean that the entire molecule is a positively charged ion;
in fact, it is negatively charged due to the phosphates. A nitrogen atom normally has


◄ Figure 7.6

Formation of UDP-glucose catalyzed by UDP-glucose
pyrophosphorylase. An oxygen of the
phosphate group of α-D-glucose 1-phosphate
attacks the α-phosphorus of UTP. The PPᵢ
released is rapidly hydrolyzed to 2 Pᵢ by the
action of pyrophosphatase. This hydrolysis
helps drive the pyrophosphorylase-catalyzed
reaction toward completion. The mobile glycosyl
group of UDP-glucose is shown in red.

seven protons and seven electrons. The outer shell has five electrons that can participate 
in bond formation. In the oxidized form of the coenzyme (NAD⁺ and NADP⁺)
the nicotinamide nitrogen is missing one of its electrons. It has only four electrons in 
the outer shell and those are shared with adjacent carbon atoms to form a total of four 
covalent bonds. (Each bond has a pair of electrons so the outer shell of the nitrogen 
atom is filled with eight shared electrons.) This is why we normally associate the posi- 
tive charge with the ring nitrogen atom as shown in Figure 7.8. In fact, the charge is 
distributed over the entire aromatic ring. 

The reduced form of the nitrogen atom has its normal, full complement of elec- 
trons. In particular, the nitrogen atom has five electrons in its outer shell. Two of these 
electrons (represented by dots in Figure 7.8) are a free pair of electrons. The other three 
electrons participate in three covalent bonds. 
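
The electron bookkeeping described above can be checked with the usual formal-charge rule (valence electrons minus nonbonding electrons minus the number of covalent bonds). The short Python sketch below is only an illustration of that arithmetic; the function and variable names are made up for this example and are not part of the text.

```python
def formal_charge(valence_electrons, nonbonding_electrons, covalent_bonds):
    """Formal charge = valence electrons - nonbonding electrons - number of bonds."""
    return valence_electrons - nonbonding_electrons - covalent_bonds


# A neutral nitrogen atom has five electrons in its outer shell.
NITROGEN_VALENCE = 5

# Oxidized coenzyme (NAD+ / NADP+): the ring nitrogen forms four covalent
# bonds and has no free electron pair.
oxidized_ring_nitrogen = formal_charge(NITROGEN_VALENCE, 0, 4)

# Reduced coenzyme (NADH / NADPH): the ring nitrogen forms three covalent
# bonds and keeps one free pair (two nonbonding electrons).
reduced_ring_nitrogen = formal_charge(NITROGEN_VALENCE, 2, 3)

print("oxidized ring nitrogen:", oxidized_ring_nitrogen)
print("reduced ring nitrogen:", reduced_ring_nitrogen)
```

The calculation assigns the whole charge to the nitrogen atom, which is a bookkeeping convention; as noted above, the charge is actually distributed over the aromatic ring.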

NAD⁺ and NADP⁺ almost always act as cosubstrates for dehydrogenases. Pyridine
nucleotide-dependent dehydrogenases catalyze the oxidation of their substrates by
transferring two electrons and a proton in the form of a hydride ion (H⁻) to C-4 of the
nicotinamide group of NAD⁺ or NADP⁺. This generates the reduced form, NADH or
NADPH, where a new C—H bond has formed at C-4 (one pair of electrons) and the
electron previously associated with the ring double bond has delocalized to the ring nitrogen
atom. Thus, oxidation by pyridine nucleotides (or reduction, the reverse reaction)
always occurs two electrons at a time.

NADH and NADPH are said to possess reducing power (i.e., they are biological 
reducing agents). The stability of reduced pyridine nucleotides allows them to carry 
their reducing power from one enzyme to another, a property not shared by flavin 




▲ Figure 7.7 

Nicotinic acid (niacin) and nicotinamide. 

NADH and NADPH exhibit a peak of
ultraviolet absorbance at 340 nm due
to the dihydropyridine ring, whereas
NAD⁺ and NADP⁺ do not absorb light
at this wavelength. The appearance
and disappearance of absorbance at
340 nm are useful for measuring the
rates of oxidation and reduction reactions
if they involve NAD⁺ or NADP⁺
(see Box 10.1).
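
As a rough illustration of how a change in absorbance at 340 nm can be turned into a reaction rate, the Python sketch below applies the Beer-Lambert law (A = ε·c·l). The molar absorption coefficient of 6220 M⁻¹ cm⁻¹ for NADH at 340 nm is a commonly quoted value that is assumed here rather than taken from this chapter, and the function names and the 1 cm path length are likewise illustrative choices.

```python
# Assumed molar absorption coefficient of NADH (and NADPH) at 340 nm,
# in M^-1 cm^-1. This is a commonly quoted value, not one given in the text.
NADH_EXTINCTION_340 = 6220.0


def nadh_concentration(absorbance_340, path_length_cm=1.0):
    """Beer-Lambert law: concentration (M) = A / (epsilon * path length)."""
    return absorbance_340 / (NADH_EXTINCTION_340 * path_length_cm)


def nadh_formation_rate(a340_start, a340_end, elapsed_seconds, path_length_cm=1.0):
    """Average rate of NADH formation (M per second) from two A340 readings."""
    delta_concentration = nadh_concentration(a340_end - a340_start, path_length_cm)
    return delta_concentration / elapsed_seconds


# Hypothetical readings: A340 rises from 0.000 to 0.622 over 100 seconds.
rate = nadh_formation_rate(0.000, 0.622, 100.0)
print(f"average rate of NADH formation: {rate:.2e} M/s")
```

A falling A340 gives a negative rate, which corresponds to oxidation of NADH (or NADPH) rather than its formation.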





▲ Figure 7.8 

Oxidized and reduced forms of NAD (and 
NADP). The pyridine ring of NAD© is re- 
duced by the addition of a hydride ion to C-4 
when NAD© is converted to NADH (and when 
NADP© is converted to NADPH). In NADP©, 
the 2'-hydroxyl group of the sugar ring of 
adenosine is phosphorylated. The reactive 
center of these coenzymes is shown in red. 

coenzymes (Section 7.5). Most reactions forming NADH and NADPH are catabolic re- 
actions and the subsequent oxidation of NADH by the membrane- associated electron 
transport system is coupled to the synthesis of ATP. Most NADPH is used as a reducing 
agent in biosynthetic reactions. The concentration of NADH is about ten times higher 
than that of NADPH. 
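The 340 nm absorbance of NADH and NADPH lends itself to a quick calculation. The sketch below is a minimal illustration, assuming the Beer-Lambert law and a molar absorptivity of about 6220 M⁻¹ cm⁻¹ for NADH at 340 nm. That value is a commonly cited figure rather than one given in this section, and the function names are hypothetical.

```python
# Assumed molar absorptivity of NADH at 340 nm (commonly cited; not from this text).
EPSILON_NADH_340 = 6220.0  # M^-1 cm^-1


def nadh_concentration(a340: float, path_cm: float = 1.0) -> float:
    """Return the NADH concentration in mol/L from an absorbance at 340 nm.

    Uses the Beer-Lambert law, A = epsilon * c * l, solved for c.
    """
    return a340 / (EPSILON_NADH_340 * path_cm)


def rate_micromolar_per_min(delta_a340_per_min: float, path_cm: float = 1.0) -> float:
    """Convert a change in A340 per minute into micromolar NADH per minute."""
    return nadh_concentration(delta_a340_per_min, path_cm) * 1e6


if __name__ == "__main__":
    print(nadh_concentration(0.622))        # mol/L for A340 = 0.622 in a 1 cm cuvette
    print(rate_micromolar_per_min(0.0622))  # micromolar per minute
```

Under these assumptions, an absorbance of 0.622 in a 1 cm cuvette corresponds to roughly 100 μM NADH. Because NAD⊕ does not absorb at this wavelength, the change in absorbance can be attributed to the reduced coenzyme alone.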

Lactate dehydrogenase is an oxidoreductase that catalyzes the reversible oxidation
of lactate. The enzyme is a typical NAD-dependent dehydrogenase. A proton is released
from lactate when NAD⊕ is reduced.


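$$
\underset{\text{Lactate}}{\mathrm{H_3C{-}CH(OH){-}COO^{\ominus}}} + \mathrm{NAD^{\oplus}}
\;\rightleftharpoons\;
\underset{\text{Pyruvate}}{\mathrm{H_3C{-}CO{-}COO^{\ominus}}} + \mathrm{NADH} + \mathrm{H^{\oplus}}
\tag{7.3}
$$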

NADH is a cosubstrate, like ATP. When the reaction is complete, the structure of the
cosubstrate is altered and the original form must be regenerated in a separate reaction.
In this example, NAD⊕ is reduced to NADH and the reaction will soon reach equilibrium
unless NADH is used up in a separate reaction where NAD⊕ is regenerated. We
describe one example of how this is accomplished in Section 11.3B.

Figure 7.9 shows how both the enzyme and the coenzyme participate in the oxidation
of lactate to pyruvate catalyzed by lactate dehydrogenase. In this mechanism, the
coenzyme accepts a hydride ion at C-4 in the nicotinamide group. This leads to a
rearrangement of bonds in the ring as electrons are shuffled to the positively charged
nitrogen atom. The enzyme provides an acid-base catalyst and suitable binding sites for
both the coenzyme and the substrate. Note that two hydrogens are removed from
lactate to produce pyruvate (Equation 7.3). One of these hydrogens is transferred to
NAD⊕ as a hydride ion carrying two electrons and the other is transferred to His-195
as a proton. The second hydrogen is subsequently released as H⊕ in order to regenerate
the base catalyst (His-195). There are many examples of NAD-dependent reactions
where the reduction of NAD⊕ is accompanied by release of a proton, so it's quite common
to see NADH + H⊕ on one side of the equation.
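The hydrogen and charge bookkeeping in Equation 7.3 can be checked with a few lines of code. The sketch below is illustrative only: the `Species` type and `totals` function are hypothetical names, charges are taken as written in the equation, and hydrogens on the coenzyme are counted relative to NAD⊕ (so NADH carries one extra hydrogen, the hydride).

```python
from typing import Iterable, NamedTuple, Tuple


class Species(NamedTuple):
    name: str
    charge: int     # net charge as written in Equation 7.3
    hydrogens: int  # hydrogen atoms counted for bookkeeping


LACTATE = Species("lactate", -1, 5)    # H3C-CH(OH)-COO(-)
PYRUVATE = Species("pyruvate", -1, 3)  # H3C-CO-COO(-)
NAD_OX = Species("NAD+", +1, 0)        # hydrogens counted relative to NAD+
NADH = Species("NADH", 0, 1)           # NAD+ plus one hydride (H with two electrons)
PROTON = Species("H+", +1, 1)


def totals(side: Iterable[Species]) -> Tuple[int, int]:
    """Return (total charge, total hydrogens) for one side of the equation."""
    side = list(side)
    return sum(s.charge for s in side), sum(s.hydrogens for s in side)


reactants = [LACTATE, NAD_OX]
products = [PYRUVATE, NADH, PROTON]
is_balanced = totals(reactants) == totals(products)

if __name__ == "__main__":
    print(totals(reactants), totals(products), is_balanced)
```

Both sides carry a net charge of zero and five counted hydrogens: lactate loses two hydrogens, one leaving as the hydride on NADH and one as the free proton.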


◄ Figure 7.9

Mechanism of lactate dehydrogenase. His-195, a base catalyst in the active site,
abstracts a proton from the C-2 hydroxyl group of lactate, facilitating transfer of the
hydride ion (H⊖) from C-2 of the substrate to C-4 of the bound NAD⊕. Arg-171 forms
an ion pair with the carboxylate group of the substrate. In the reverse reaction, H⊖ is
transferred from the reduced coenzyme, NADH, to C-2 of the oxidized substrate,
pyruvate.


In the 1970s, structures were determined for four NAD-dependent dehydrogenases:
lactate dehydrogenase, malate dehydrogenase, alcohol dehydrogenase, and
glyceraldehyde 3-phosphate dehydrogenase. Each of these enzymes is oligomeric, with
a chain length of about 350 amino acid residues. These chains all fold into two distinct
domains, one to bind the coenzyme and one to bind the specific substrate. For each
enzyme, the active site is in the cleft between the two domains.

As structures of more dehydrogenases were determined, several conformations of the
coenzyme-binding motif were observed. Many of them possess one or more similar
NAD- or NADP-binding structures consisting of a pair of βαβαβ units known as the
Rossmann fold after Michael Rossmann, who first observed them in nucleotide-binding
proteins (see figure). Each of the Rossmann fold motifs binds to one half of the NAD⊕
dinucleotide. All of these enzymes bind the coenzyme in the same orientation and in a
similar extended conformation.

Although many different dehydrogenases contain the Rossmann fold motif, the rest of
the structures may be very different and the dehydrogenases may not share significant
sequence similarity. It's possible that all Rossmann fold-containing enzymes descend
from a common ancestor, but it's also possible that the motifs evolved independently in
different dehydrogenases. That would be another example of convergent evolution.

◄ NAD-binding region of some dehydrogenases.

(a) The coenzyme is bound in an extended conformation through interaction with two
side-by-side motifs known as Rossmann folds. The extended protein motifs form a
β sheet of six parallel β strands. The arrow indicates the site where the hydride ion is
added to C-4 of the nicotinamide group. (b) NADH bound to a Rossmann fold motif in
rat lactate dehydrogenase [PDB 3H3F].

[Adapted from Rossmann et al. (1975). The Enzymes, Vol. 11, Part A, 3rd ed.,
P. D. Boyer, ed. (New York: Academic Press), pp. 61-102.]


▲ These yellow FADs are not flavins but 
Fish Aggregating Devices. They are buoys 
tethered to the sea floor in order to attract 
fish. This one has been deployed by the gov- 
ernment of New South Wales off the east 
coast of Australia. The strong ocean current 
is threatening to carry it off. 

7.5 FAD and FMN 

The coenzymes flavin adenine dinucleotide (FAD) and flavin mononucleotide (FMN)
are derived from riboflavin, or vitamin B₂. Riboflavin is synthesized by bacteria,
protists, fungi, plants, and some animals. Mammals obtain riboflavin from food. Riboflavin
consists of the five-carbon alcohol ribitol linked to the N-10 atom of a heterocyclic ring
system called isoalloxazine (Figure 7.10a). The riboflavin-derived coenzymes are shown in
Figure 7.10b. Like NAD⊕ and NADP⊕, FAD contains AMP and a diphosphate linkage.

Many oxidoreductases require FAD or FMN as a prosthetic group. Such enzymes 
are called flavoenzymes or flavoproteins. The prosthetic group is very tightly bound, 
usually noncovalently. By binding the prosthetic groups tightly, the apoenzymes protect 
the reduced forms from wasteful reoxidation. 

FAD and FMN are reduced to FADH₂ and FMNH₂ by taking up a hydride ion (a proton
plus two electrons) and an additional proton (Figure 7.11). The oxidized enzymes are
bright yellow as a result of the conjugated double-bond system of the isoalloxazine ring
system. The color is lost when the coenzymes are reduced to FMNH₂ and FADH₂.

FMNH₂ and FADH₂ donate electrons either one or two at a time, unlike NADH
and NADPH, which participate exclusively in two-electron transfers. A partially oxidized
compound, FADH• or FMNH•, is formed when one electron is donated. These
intermediates are relatively stable free radicals called semiquinones. The oxidation of
FADH₂ and FMNH₂ is often coupled to reduction of a metalloprotein containing Fe³⁺
(in an [Fe-S] cluster). Because an iron-sulfur cluster can accept only one electron, the
reduced flavin must be oxidized in two one-electron steps via the semiquinone
intermediate. The ability of FMN to couple two-electron transfers with one-electron
transfers is important in many electron transfer systems.
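The way a flavin bridges two-electron and one-electron carriers can be pictured as a small state machine. The following is a minimal sketch, not a model of any particular enzyme; the `FlavinState` enum and both function names are hypothetical.

```python
from enum import Enum
from typing import List


class FlavinState(Enum):
    HYDROQUINONE = "FADH2"  # fully reduced
    SEMIQUINONE = "FADH•"   # one electron removed (free radical)
    QUINONE = "FAD"         # fully oxidized


# States ordered from most reduced to most oxidized.
_ORDER = [FlavinState.HYDROQUINONE, FlavinState.SEMIQUINONE, FlavinState.QUINONE]


def oxidize(state: FlavinState, electrons: int) -> FlavinState:
    """Remove one or two electrons and return the resulting flavin state."""
    if electrons not in (1, 2):
        raise ValueError("flavins donate one or two electrons at a time")
    index = _ORDER.index(state) + electrons
    if index >= len(_ORDER):
        raise ValueError(f"{state.value} cannot donate {electrons} electron(s)")
    return _ORDER[index]


def donate_to_one_electron_acceptors(state: FlavinState) -> List[FlavinState]:
    """Oxidize in one-electron steps (as with an [Fe-S] cluster), listing each state."""
    path = [state]
    while state is not FlavinState.QUINONE:
        state = oxidize(state, 1)
        path.append(state)
    return path


if __name__ == "__main__":
    print([s.value for s in donate_to_one_electron_acceptors(FlavinState.HYDROQUINONE)])
```

A two-electron acceptor would take the hydroquinone directly to the quinone with `oxidize(state, 2)`, whereas one-electron acceptors force the path through the semiquinone.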

Crystals of Old Yellow Enzyme, a typical flavoprotein, are shown in the introduction
to Chapter 5.

7.6 Coenzyme A and Acyl Carrier Protein

Many metabolic processes depend on coenzyme A (CoA, or HS-CoA) including the
oxidation of fuel molecules and the biosynthesis of some carbohydrates and lipids.
This coenzyme is involved in acyl-group-transfer reactions in which simple carboxylic
acids and fatty acids are the mobile metabolic groups. Coenzyme A has three major
components: a 2-mercaptoethylamine unit that bears a free -SH group, the vitamin
pantothenate (vitamin B₅, an amide of β-alanine and pantoate), and an ADP moiety
whose 3'-hydroxyl group is esterified with a third phosphate group (Figure 7.12a).
The reactive center of CoA is the -SH group. Acyl groups covalently attach to the
-SH group to form thioesters. A common example is acetyl CoA (Figure 7.13), where
the acyl group is an acetyl moiety. Acetyl CoA is a “high energy” compound due to the
thioester linkage (Section 19.8). Coenzyme A was originally named for its role as the
acetylation coenzyme. We will see acetyl CoA frequently when we discuss the
metabolism of carbohydrates, fatty acids, and amino acids.

Figure 7.10 ►

Riboflavin and its coenzymes. (a) Riboflavin. Ribitol is linked to the isoalloxazine ring
system. (b) Flavin mononucleotide (FMN, black) and flavin adenine dinucleotide (FAD,
black and blue). The reactive center is shown in red.

◄ Figure 7.11

Reduction and reoxidation of FMN or FAD. The conjugated double bonds between N-1
and N-5 are reduced by addition of a hydride ion and a proton to form FMNH₂ or
FADH₂, respectively, the hydroquinone form of each coenzyme. Oxidation occurs in two
steps. A single electron is removed by a one-electron oxidizing agent, with loss of a
proton, to form a relatively stable free-radical intermediate (the semiquinone form).
This semiquinone is then oxidized by removal of a proton and an electron to form fully
oxidized FMN or FAD. These reactions are reversible.

▼ Figure 7.12

Coenzyme A and acyl carrier protein (ACP). (a) In coenzyme A, 2-mercaptoethylamine
is bound to the vitamin pantothenate, which in turn is bound via a phosphoester linkage
to an ADP group that has an additional 3'-phosphate group. The reactive center is
the thiol group (red). (b) In acyl carrier protein, the phosphopantetheine prosthetic
group, which consists of the 2-mercaptoethylamine and pantothenate moieties of
coenzyme A, is esterified to a serine residue of the protein.

▲ Figure 7.13

Acetyl CoA (H₃C-C(=O)-S-CoA).

Phosphopantetheine, a phosphate ester containing the 2-mercaptoethylamine and 
pantothenate moieties of coenzyme A, is the prosthetic group of a small protein (77 
amino acid residues) known as the acyl carrier protein (ACP). The prosthetic group is 
esterified to ACP via the side-chain oxygen of a serine residue (Figure 7.12b). The — SH 
of the prosthetic group of ACP is acylated by intermediates in the biosynthesis of fatty 
acids (Chapter 16). 

The metabolic role of pyruvate decarboxylase will be encountered in Section 11.3.
Transketolases are discussed in Section 12.9. The role of TDP as a coenzyme in
pyruvate dehydrogenase is described in Section 13.2.

7.7 Thiamine Diphosphate 

Thiamine (or vitamin B₁) contains a pyrimidine ring and a positively charged
thiazolium ring (Figure 7.14a). The coenzyme is thiamine diphosphate (TDP), also called
thiamine pyrophosphate (TPP) in the older literature (Figure 7.14b). TDP is
synthesized from thiamine by enzymatic transfer of a pyrophosphoryl group from ATP.

About half a dozen decarboxylases (carboxy-lyases) are known to require TDP as a
coenzyme. For example, TDP is the prosthetic group of yeast pyruvate decarboxylase
whose mechanism is shown in Figure 7.15. TDP is also a coenzyme involved in the
oxidative decarboxylation of α-keto acids other than pyruvate. The first steps in those
reactions proceed by the mechanism shown in Figure 7.15. In addition, TDP is a
prosthetic group for enzymes known as transketolases that catalyze transfer between
sugar molecules of two-carbon groups that contain a keto group.

Figure 7.14 ►

Thiamine diphosphate (TDP). (a) Thiamine (vitamin B₁). (b) Thiamine diphosphate
(TDP). The thiazolium ring of the coenzyme contains the reactive center (red).

◄ Figure 7.15

Mechanism of yeast pyruvate decarboxylase. The positive charge of the thiazolium ring
of TDP attracts electrons, weakening the bond between C-2 and hydrogen. This proton
is presumably removed by a basic residue of the enzyme. Ionization generates a dipolar
carbanion known as an ylid (a molecule with opposite charges on adjacent atoms). The
negatively charged C-2 attacks the electron-deficient carbonyl carbon of the substrate
pyruvate and the first product (CO₂) is released. Two carbons of pyruvate are now
attached to the thiazole ring as part of a resonance-stabilized carbanion. In the
following step, protonation of the carbanion produces hydroxyethylthiamine
diphosphate (HETDP). HETDP is cleaved, releasing acetaldehyde (the second product)
and regenerating the ylid form of the enzyme-TDP complex. TDP re-forms when the
ylid is protonated by the enzyme.

The thiazolium ring of the coenzyme contains the reactive center. C-2 of TDP has
unusual reactivity; it is acidic despite its extremely high pKa in aqueous solution.
Similarly, recent experiments indicate that the pKa value for the ionization of
hydroxyethylthiamine diphosphate (HETDP) (i.e., formation of the dipolar carbanion) is
changed from 15 in water to 6 at the active site of pyruvate decarboxylase. This increased
acidity is attributed to low polarity of the active site, which also accounts for the
reactivity of TDP.
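To get a feel for what a shift in pKa from 15 to 6 means, the fraction of a weak acid that is ionized can be estimated with the Henderson-Hasselbalch relationship. This is a rough sketch only: the choice of pH 7 is an assumption made for illustration, and the function name is hypothetical.

```python
def fraction_ionized(pka: float, ph: float) -> float:
    """Return the fraction of a weak acid present in its deprotonated form.

    Derived from the Henderson-Hasselbalch equation:
    [A-] / ([HA] + [A-]) = 1 / (1 + 10 ** (pKa - pH)).
    """
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


if __name__ == "__main__":
    assumed_ph = 7.0  # illustrative value, not taken from the text
    print(fraction_ionized(15.0, assumed_ph))  # pKa in water
    print(fraction_ionized(6.0, assumed_ph))   # pKa at the active site
```

With a pKa of 15, only about one molecule in a hundred million would be ionized at pH 7, whereas with a pKa of 6 roughly 90% would be. The nine-unit shift therefore changes the carbanion from a vanishingly rare species to the predominant one.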

7.8 Pyridoxal Phosphate 

The B₆ family of water-soluble vitamins consists of three closely related molecules that
differ only in the state of oxidation or amination of the carbon bound to position 4 of
the pyridine ring (Figure 7.16a). Vitamin B₆, most often pyridoxal or pyridoxamine,
is widely available from plant and animal sources. Induced B₆ deficiencies in rats result
in dermatitis and various disorders related to protein metabolism but actual vitamin
B₆ deficiencies in humans are rare. Enzymatic transfer of the γ-phosphoryl group from
ATP forms the coenzyme pyridoxal 5'-phosphate (PLP) once vitamin B₆ enters a cell
(Figure 7.16b).

▲ Thiamine diphosphate bound to pyruvate dehydrogenase. The coenzyme is bound in
an extended conformation and the diphosphate group is chelated to a magnesium
ion (green). [PDB 1PYD]

Figure 7.16 ►

B₆ vitamins and pyridoxal phosphate. (a) Vitamins of the B₆ family: pyridoxine,
pyridoxal, and pyridoxamine. (b) Pyridoxal 5'-phosphate (PLP). The reactive center of
PLP is the aldehyde group (red).

Figure 7.17 ►

Binding of substrate to a PLP-dependent enzyme. The Schiff base linking PLP to a
lysine residue of the enzyme (the internal aldimine) is replaced by reaction of the
substrate molecule with PLP. The transimination reaction passes through a
geminal-diamine intermediate, resulting in a Schiff base composed of PLP and the
substrate.

Pyridoxal phosphate is the prosthetic group for many enzymes that catalyze a variety
of reactions involving amino acids such as isomerizations, decarboxylations, and
side-chain eliminations or replacements. In PLP-dependent enzymes, the carbonyl
group of the prosthetic group is bound as a Schiff base (imine) to the ε-amino group of
a lysine residue at the active site. (A Schiff base results from condensation of a primary
amine with an aldehyde or ketone.) The enzyme-coenzyme Schiff base, shown on the
left in Figure 7.17, is sometimes referred to as an internal aldimine. PLP is tightly bound
to the enzyme by many weak noncovalent interactions; the additional covalent linkage
of the internal aldimine helps prevent loss of the scarce coenzyme when the enzyme is
not functioning.




▲ Figure 7.18

Mechanism of transaminases. An amino acid displaces lysine from the internal aldimine that links PLP to the enzyme, generating an external
aldimine. Subsequent steps lead to the transfer of the amino group to PLP yielding an α-keto acid, which dissociates, and PMP, which remains
bound to the enzyme. If another α-keto acid enters, each step proceeds in reverse. The amino group is transferred to the α-keto acid producing a
new amino acid and regenerating the original PLP form of the enzyme.

The initial step in all PLP-dependent enzymatic reactions with amino acids is the
linkage of PLP to the α-amino group of the amino acid (formation of an external
aldimine). When an amino acid binds to a PLP-enzyme, a transimination reaction takes
place (Figure 7.17). This transfer reaction proceeds via a geminal-diamine intermediate
rather than via formation of the free-aldehyde form of PLP. Note that the Schiff bases
contain a system of conjugated double bonds in the pyridine ring ending with the
positive charge on N-1. Similar ring structures with positively charged nitrogen atoms are
present in NAD⊕. The prosthetic group serves as an electron sink during subsequent
steps in the reactions catalyzed by PLP-enzymes. Once an α-amino acid forms a Schiff
base with PLP, electron withdrawal toward N-1 weakens the three bonds to the α-carbon.
In other words, the Schiff base with PLP stabilizes a carbanion formed when one of the
three groups attached to the α-carbon of the amino acid is removed. Which group is
lost depends on the chemical environment of the enzyme active site.

Removal of the α-amino group from amino acids is catalyzed by transaminases
that participate in both the biosynthesis and degradation of amino acids (Chapter 17).
Transamination is the most frequently encountered PLP-dependent reaction. The
mechanism involves formation of an external aldimine (Figure 7.17) followed by
release of the α-keto acid. The amino group remains bound to PLP forming pyridoxamine
phosphate (PMP) (Figure 7.18). The next step in transaminase reactions is the reverse
of the reaction shown in Figure 7.18 using a different α-keto acid as a substrate.

7.9 Vitamin C 

The simplest vitamin is the antiscurvy agent ascorbic acid (vitamin C). Scurvy is a
disease whose symptoms include skin lesions, fragile blood vessels, loose teeth, and
bleeding gums. The link between scurvy and nutrition was recognized four centuries ago
when British navy physicians discovered that citrus juice from limes and lemons was a
remedy for scurvy in sailors whose diet lacked fresh fruits and vegetables. It was not
until 1919, however, that ascorbic acid was isolated and shown to be the essential
dietary component supplied by citrus juices.

► Limeys is the story of Dr. James Lind and his attempt to promote citrus fruit as a cure for scurvy 
in the 1700s. 

A specific transaminase is described 
in Section 17.2B. 



▲ The human GULO pseudogene is located on the short arm of chromosome 8.

▲ Figure 7.19

Ascorbic acid (vitamin C) and its dehydro, oxidized form. Ascorbic acid is converted to
dehydroascorbic acid by the loss of 2 H⊕ and 2 e⊖.

Back in the 18th century it was not easy to convince authorities that a simple
solution like citrus fruit would solve the problem of scurvy because there were many
competing theories. The story of Dr. James Lind and his efforts to convince the British navy
is just one of many stories associated with vitamin C. It shows us that scientific evidence
is not all that's required in order to make changes in the way we do things. Eventually,
British sailors began to eat lemons and limes on a regular basis when they were at sea.
Not only did this reduce the incidence of scurvy but it also gave rise to a famous
nickname for British sailors. They were called “limeys” although lemons were much more
effective than limes.

Ascorbic acid is a lactone, an internal ester in which the C-1 carboxylate group is
condensed with the C-4 hydroxyl group, forming a ring structure. We now know that
ascorbic acid is not a coenzyme but acts as a reducing agent in several different
enzymatic reactions (Figure 7.19). The most important of these reactions is the
hydroxylation of collagen (Section 4.12). Most mammals can synthesize ascorbic acid but
guinea pigs, bats, and some primates (including humans) lack this ability and must
therefore rely on dietary sources.

In most cases, we don’t know very much about how certain enzymes disappeared 
from some species leading to a reliance on external sources for some essential metabo- 
lites. Most of the presumed gene disruption events happened so far in the distant past 
that few traces remain in modern genomes. The loss of ability to make vitamin C is an 
exception to that rule and serves as an instructive example of evolution. 

Ascorbic acid is synthesized from D-glucose in a five-step pathway involving four
enzymes (the last step is spontaneous). The last enzyme in the pathway is
L-gulono-gamma-lactone oxidase (GULO) (Figure 7.20). GULO (the enzyme) is not
present in primates of the suborder Haplorrhini (monkeys and apes), but it is present in
the Strepsirrhini (lemurs, lorises, etc.). These groups diverged about 80 million years
ago. This led to the prediction that the GULO gene would be absent or defective in the
monkeys and apes but intact in the other primates.

▲ Figure 7.20

Biosynthesis of ascorbic acid (vitamin C). L-ascorbic acid is synthesized from
D-glucose. The last enzymatic step is catalyzed by L-gulono-gamma-lactone oxidase
(GULO), an enzyme that is missing in most primates.

◄ Figure 7.21

Comparison of the intact rat GULO gene and the human pseudogene. The human
pseudogene is missing the first six exons and exon 11. In addition, there are many
mutations in the remaining exons that prevent them from producing protein product.

The prediction was confirmed with the discovery of a human GULO pseudogene
on chromosome 8 in a block of genes that contains an active GULO gene in other
animals. A comparison of the human pseudogene and a functional rat gene reveals many
differences (Figure 7.21). The human pseudogene is missing the first six exons of the
normal gene plus exon 11. The pseudogene in other apes is also missing these exons,
indicating that the ancestor of all apes had a similar defective GULO gene.

The original mutation that made the GULO gene inactive isn’t known. Once inac- 
tivated, the pseudogene accumulated additional mutations that became fixed by 
random genetic drift. We can assume that lack of ability to synthesize vitamin C was 
not detrimental in these species because they obtained sufficient quantities in their 
normal diet. 

7.10 Biotin 

Biotin is a prosthetic group for enzymes that catalyze carboxyl group transfer reactions
and ATP-dependent carboxylation reactions. Biotin is covalently linked to the active
site of its host enzyme by an amide bond to the ε-amino group of a lysine residue
(Figure 7.22).


◄ Figure 7.22

Enzyme-bound biotin. The carboxylate group of biotin is covalently bound via amide
linkage to the ε-amino group of a lysine residue (blue). The reactive center of the biotin
moiety is N-1 (red).

The pyruvate carboxylase reaction demonstrates the role of biotin as a carrier of
carbon dioxide (Figure 7.23). In this ATP-dependent reaction, pyruvate, a three-carbon
acid, reacts with bicarbonate forming the four-carbon acid oxaloacetate. Enzyme-bound
biotin is the intermediate carrier of the mobile carboxyl metabolic group. The
pyruvate carboxylase reaction is an important CO₂ fixation reaction. It is required in
the gluconeogenesis pathway (Chapter 11).

Biotin was first identified as an essential factor for the growth of yeast. Biotin
deficiency is rare in humans or animals on normal diets because biotin is synthesized by
intestinal bacteria and is required only in very small amounts (micrograms per day). A
biotin deficiency can be induced, however, by ingesting raw egg whites that contain a
protein called avidin. Avidin binds tightly to biotin making it unavailable for absorption
from the intestinal tract. Avidin is denatured when eggs are cooked and it loses its
affinity for biotin.

▲ Figure 7.23

Reaction catalyzed by pyruvate carboxylase. First, biotin, bicarbonate, and ATP react to form carboxybiotin. The carboxybiotinyl-enzyme complex provides
a stable, activated form of CO₂ that can be transferred to pyruvate. Next, the enolate form of pyruvate attacks the carboxyl group of carboxybiotin,
forming oxaloacetate and regenerating biotin.

A variety of laboratory techniques take advantage of the high affinity of avidin for
biotin. For example, a substance to which biotin is covalently attached can be extracted
from a complex mixture by affinity chromatography (Section 3.6) on a column of
immobilized avidin. The association constant for biotin and avidin is about 10¹⁵ M⁻¹,
one of the tightest binding interactions known in biochemistry (see Section 4.9).
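The association constant quoted above can be converted into a dissociation constant and a standard free energy of binding. The sketch below assumes the standard relationship ΔG° = -RT ln Ka and a temperature of 25 °C (298.15 K); the temperature and the function names are assumptions made for illustration.

```python
import math

R = 8.314  # gas constant, J mol^-1 K^-1


def dissociation_constant(ka: float) -> float:
    """Return Kd (M) as the reciprocal of the association constant Ka (M^-1)."""
    return 1.0 / ka


def standard_free_energy_kj(ka: float, temp_k: float = 298.15) -> float:
    """Return the standard free energy of binding in kJ/mol: -RT ln(Ka)."""
    return -R * temp_k * math.log(ka) / 1000.0


if __name__ == "__main__":
    ka_avidin_biotin = 1e15  # M^-1, the approximate value given in the text
    print(dissociation_constant(ka_avidin_biotin))    # Kd in M
    print(standard_free_energy_kj(ka_avidin_biotin))  # kJ/mol at 25 °C
```

Under these assumptions the dissociation constant is about 10⁻¹⁵ M and the standard free energy of binding is close to -86 kJ/mol, which is unusually large for a noncovalent interaction.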


George Beadle and Edward Tatum wanted to test the idea 
that each gene encoded a single enzyme in a metabolic path- 
way. It was back in the late 1930s and this correspondence, 
which we now take for granted, was still a hypothesis. Re- 
member, this was a time when it wasn’t even clear whether 
genes were proteins or some other kind of chemical. 

Beadle and Tatum chose the fungus Neurospora crassa for their experiments.
Neurospora grows on a well-defined medium needing only sugar and biotin (vitamin B₇)
as supplements. They reasoned that by irradiating Neurospora spores with X rays they
could find mutants that would grow on rich supplemented medium but not on the
simple defined medium. All they had to do next was identify the one supplement that
needed to be added to the minimal medium to correct the defect. This would identify a
gene for an enzyme that synthesized the now-essential supplement.

The 299th mutant required vitamin B₆ and the 1085th mutant required vitamin B₁.
The B₆ and B₁ biosynthesis pathways were the first two pathways to be identified in
this set of experiments. Later on, they worked out the genes/enzymes used in the
tryptophan pathway. The results were published in 1941 and Beadle and Tatum received
the Nobel Prize in Physiology or Medicine in 1958.

▲ Neurospora crassa growing on defined medium in a test tube. The strains on the
right are producing orange carotenoid and the ones on the left are nonproducing
strains. (Source: Courtesy of Manchester University, United Kingdom.)



7.11 Tetrahydrofolate 

The vitamin folate was first isolated in the early 1940s from green leaves, liver, and yeast.
Folate has three main components: pterin (2-amino-4-oxopteridine), a p-aminobenzoic
acid moiety, and a glutamate residue. The structures of pterin and folate are shown in
Figures 7.24a and 7.24b. Humans require folate in their diets because we cannot
synthesize the pterin-p-aminobenzoic acid intermediate (PABA) and we cannot add
glutamate to exogenous PABA.

The coenzyme forms of folate, known collectively as tetrahydrofolate, differ from
the vitamin in two respects: they are reduced compounds (5,6,7,8-tetrahydropterins),
and they are modified by the addition of glutamate residues bound to one another
through γ-glutamyl amide linkages (Figure 7.24c). The anionic polyglutamyl moiety,
usually five to six residues long, participates in the binding of the coenzymes to
enzymes. When using the term tetrahydrofolate, keep in mind that it refers to compounds
that have polyglutamate tails of varying lengths.

Tetrahydrofolate is formed from folate by adding hydrogen to positions 5, 6, 7, and 
8 of the pterin ring system. Folate is reduced in two NADPH-dependent steps in a reac- 
tion catalyzed by dihydrofolate reductase (DHFR). 

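$$
\text{Folate}
\;\xrightarrow{\;\mathrm{NADPH} + \mathrm{H^{\oplus}}\;}\;
\text{7,8-Dihydrofolate}
\;\xrightarrow{\;\mathrm{NADPH} + \mathrm{H^{\oplus}}\;}\;
\text{5,6,7,8-Tetrahydrofolate}
$$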


The primary metabolic function of dihydrofolate reductase is the reduction of di- 
hydrofolate produced during the formation of the methyl group of thymidylate 
(dTMP) (Chapter 18). This reaction, which uses a derivative of tetrahydrofolate, is an 
essential step in the biosynthesis of DNA. Because cell division cannot occur when DNA 
synthesis is interrupted, dihydrofolate reductase has been extensively studied as a target 
for chemotherapy in the treatment of cancer (Box 18.4). In most species, dihydrofolate 
reductase is a relatively small monomeric enzyme that has evolved efficient binding sites 
for the two large substrates (folate and NADPH) (Figure 6.12). 

▼ Figure 7.24

Pterin, folate, and tetrahydrofolate. Pterin (a) is part of folate (b), a molecule
containing p-aminobenzoate (red) and glutamate (blue). (c) The polyglutamate forms of
tetrahydrofolate (tetrahydrofolyl polyglutamate) usually contain five or six glutamate
residues. The reactive centers of the coenzyme, N-5 and N-10, are shown in red.


Figure 7.25 ►

One-carbon derivatives of tetrahydrofolate. The derivatives can be interconverted
enzymatically by the routes shown. (R represents the benzoyl polyglutamate portion of
tetrahydrofolate.) Derivatives labeled in the figure include 5-methyltetrahydrofolate,
5,10-methylenetetrahydrofolate, 5,10-methenyltetrahydrofolate, and
10-formyltetrahydrofolate.

▲ Many fruits and vegetables contain adequate supplies of folate. Yeast and liver
products are also excellent sources of folate.

▲ Figure 7.26

5,6,7,8-Tetrahydrobiopterin. The hydrogen atoms lost on oxidation are shown in red.

5,6,7,8-Tetrahydrofolate is required by enzymes that catalyze biochemical transfers
of several one-carbon units. The groups bound to tetrahydrofolate are methyl, methylene,
or formyl groups. Figure 7.25 shows the structures of several one-carbon derivatives of
tetrahydrofolate and the enzymatic interconversions that occur among them. The
one-carbon metabolic groups are covalently bound to the secondary amine N-5 or N-10 of
tetrahydrofolate, or to both in a ring form. 10-Formyltetrahydrofolate is the donor of
formyl groups and 5,10-methylenetetrahydrofolate is the donor of hydroxymethyl groups.
Another pterin coenzyme, 5,6,7,8-tetrahydrobiopterin, has a three-carbon side 
chain at C-6 of the pterin moiety in place of the large side chain found in tetrahydrofo- 
late (Figure 7.26). This coenzyme is not derived from a vitamin but is synthesized by 
animals and other organisms. Tetrahydrobiopterin is the cofactor for several hydroxy- 
lases and will be encountered as a reducing agent in the conversion of phenylalanine to 
tyrosine (Chapter 17). It also is required by the enzyme that catalyzes the synthesis of 
nitric oxide from arginine (Section 17.12). 

The sale of vitamins and supplements is big business in developed nations. It’s 
often difficult to decide whether an extra supply of vitamins is necessary for good health 
because the scientific evidence is often missing or contradictory. Folate (vitamin B₉)
deficiency is uncommon in normal, healthy adults and children in developed nations 
but there are documented cases of folate deficiency in pregnant women. A lack of 
tetrahydrofolate can lead to anemia and to severe defects in the developing fetus. 
While there are many fruits and vegetables that contain folate, it’s a good idea for preg- 
nant women to supplement their diet with folate in order to ensure their own health 
and that of the baby. 


7.12 Cobalamin 

Cobalamin (vitamin B₁₂) is the largest B vitamin and was the last to be isolated. The
structure of cobalamin (Figure 7.27a) includes a corrin ring system that resembles the 
porphyrin ring system of heme (Figure 4.37). Note that cobalamin contains cobalt 
rather than the iron found in heme. The abbreviated structure shown in Figure 7.27b 
emphasizes the positions of two axial ligands bound to the cobalt, a benzimida- 
zole ribonucleotide below the corrin ring and an R group above it. In the coenzyme 
forms of cobalamin, the R group is either a methyl group (in methylcobalamin) or a 
5'-deoxyadenosyl group (in adenosylcobalamin). 

Cobalamin is synthesized by only a few microorganisms. It is required as a mi- 
cronutrient by all animals and by some bacteria and algae. Humans obtain cobalamin 
from foods of animal origin. A deficiency of cobalamin can lead to pernicious anemia, a 
potentially fatal disease in which there is a decrease in the production of blood cells by 
bone marrow. Pernicious anemia can also cause neurological disorders. Most victims of 
pernicious anemia do not secrete a necessary glycoprotein (called intrinsic factor) from 
the stomach mucosa. This protein specifically binds cobalamin and the complex is ab- 
sorbed by cells of the small intestine. Impaired absorption of cobalamin is now treated 
by regular injections of the vitamin. 

The role of adenosylcobalamin reflects the reactivity of its C — Co bond. The coen- 
zyme participates in several enzyme-catalyzed intramolecular rearrangements in which a 
hydrogen atom and a second group, bound to adjacent carbon atoms within a substrate, 
exchange places (Figure 7.28a). An example is the methylmalonyl-CoA mutase reaction 
(Figure 7.28b) that is important in the metabolism of odd-chain fatty acids (Chapter 16) 
and leads to the formation of succinyl CoA, an intermediate of the citric acid cycle. 
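The exchange of a hydrogen atom and a second group between adjacent carbons can be sketched as a swap between two substituent lists. This is a bookkeeping illustration only, not a model of the radical mechanism. The function name and the list representation are hypothetical, and the choice of the thioester group (written here as "CO-S-CoA") as the migrating group is an assumption taken from the usual description of the methylmalonyl-CoA mutase reaction.

```python
def shift_1_2(hydrogen_donor, group_donor, group):
    """Exchange one H on `hydrogen_donor` with `group` on `group_donor`.

    Each carbon is given as a list of its substituents, not counting the
    bond that joins the two carbons. New sorted lists are returned.
    """
    new_h_donor = list(hydrogen_donor)
    new_g_donor = list(group_donor)
    new_h_donor.remove("H")      # raises ValueError if no H is available
    new_g_donor.remove(group)    # raises ValueError if the group is absent
    new_h_donor.append(group)
    new_g_donor.append("H")
    return sorted(new_h_donor), sorted(new_g_donor)


# Methylmalonyl CoA, written as the methyl carbon and the adjacent carbon.
methyl_carbon = ["H", "H", "H"]
branch_carbon = ["COO-", "CO-S-CoA", "H"]

c_a, c_b = shift_1_2(methyl_carbon, branch_carbon, "CO-S-CoA")
print(c_a)  # substituents on the former methyl carbon
print(c_b)  # substituents on the former branch carbon
```

After the swap, one carbon carries the thioester and two hydrogens and the other carries the carboxylate and two hydrogens, which is the substitution pattern of succinyl CoA.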

Methylcobalamin participates in the transfer of methyl groups, as in the regenera- 
tion of methionine from homocysteine in mammals. 

▲ Dorothy Crowfoot Hodgkin (1910-1994). 

Hodgkin received the Nobel Prize in 1964 
for determining the structure of vitamin B₁₂
(cobalamin). The structure of insulin, shown 
in the photograph, was published in 1969. 

▲ Figure 7.27 

Cobalamin (vitamin B₁₂) and its coenzymes. (a) Detailed structure of cobalamin showing the corrin ring system (black) and 5,6-dimethylbenzimidazole
ribonucleotide (blue). The metal coordinated by corrin is cobalt (red). The benzimidazole ribonucleotide is coordinated with the cobalt of the corrin ring
and is also bound via a phosphoester linkage to a side chain of the corrin ring system. (b) Abbreviated structure of cobalamin coenzymes. A benzimidazole
ribonucleotide lies below the corrin ring, and an R group lies above the ring.


Figure 7.28 ► 

Intramolecular rearrangements catalyzed 
by adenosylcobalamin-dependent enzymes. 

(a) Rearrangement in which a hydrogen
atom and a substituent on an adjacent carbon
atom exchange places. (b) Rearrangement
of methylmalonyl CoA to succinyl CoA,
catalyzed by methylmalonyl-CoA mutase.

▲ Intestinal bacteria. Normal, healthy hu- 
mans harbor billions of bacteria in their in- 
testines. There are at least several dozen 
different species. The one shown here is 
Helicobacter pylori, which causes stomach 
ulcers when it invades the stomach. The 
bacteria are sitting on the surface of the 
intestine that has many projections for ab- 
sorbing nutrients. Other common species 
are Escherichia coli and various species of 
Actinomyces and Streptococcus. These bac- 
teria help break down ingested food and 
they supply many of the essential vitamins 
and amino acids that humans need, espe- 
cially cobalamin. 

Figure 7.29 ► 

Lipoamide. Lipoic acid is bound in amide
linkage to the ε-amino group of a lysine
residue (blue) of dihydrolipoamide acyltransferases.
The dithiolane ring of the lipoyllysyl
groups is extended 1.5 nm from the
polypeptide backbone. The reactive center
of the coenzyme is shown in red.


[Figure 7.28 structures: (a) generalized rearrangement; (b) Methylmalonyl CoA → Succinyl CoA]




[Reaction scheme: homocysteine → methionine, with the methyl group supplied by 5-methyltetrahydrofolate]



In this reaction, the methyl group of 5-methyltetrahydrofolate is passed to a reactive, 
reduced form of cobalamin to form methylcobalamin that can transfer the methyl 
group to the thiol side chain of homocysteine. 

7.13 Lipoamide 

The lipoamide coenzyme is the protein-bound form of lipoic acid. Lipoic acid is some- 
times described as a vitamin but animals appear to be able to synthesize it. It is required 
by certain bacteria and protozoa for growth. Lipoic acid is an eight-carbon carboxylic
acid (octanoic acid) in which two hydrogen atoms, on C-6 and C-8, have been replaced
by sulfhydryl groups in disulfide linkage. Lipoic acid does not occur free; it is covalently
attached via an amide linkage through its carboxyl group to the ε-amino group of
a lysine residue of a protein (Figure 7.29). This structure is found in dihydrolipoamide 
acyltransferases that are components of the pyruvate dehydrogenase complex and 
related enzymes. 

Lipoamide carries acyl groups between active sites in multienzyme complexes. For
example, in the pyruvate dehydrogenase complex (Section 12.2), the disulfide ring of
the lipoamide prosthetic group reacts with HETDP (Figure 7.15), binding its acetyl
group to the sulfur atom attached to C-8 of lipoamide and forming a thioester. The acyl
group is then transferred to the sulfur atom of a coenzyme A molecule, generating the
reduced (dihydrolipoamide) form of the prosthetic group.





The final step catalyzed by the pyruvate dehydrogenase complex is the oxidation of 
dihydrolipoamide. In this reaction, NADH is formed by the action of a flavoprotein 
component of the complex. The actions of the multiple coenzymes of the pyruvate de- 
hydrogenase complex show how coenzymes, by supplying reactive groups that augment 
the catalytic versatility of proteins, are used to conserve both energy and carbon building 

7.14 Lipid Vitamins 

The structures of the four lipid vitamins (A, D, E, and K) contain rings and long 
aliphatic side chains. The lipid vitamins are highly hydrophobic although each possesses 
at least one polar group. In humans and other mammals, ingested lipid vitamins are ab- 
sorbed in the intestine by a process similar to the absorption of other lipid nutrients 
(Section 16.1a). After digestion of any proteins that may bind them, they are carried to 
the cellular interface of the intestine as micelles formed with bile salts. The study of 
these hydrophobic molecules has presented several technical difficulties so research on 
their mechanisms has progressed more slowly than that on their water-soluble counter- 
parts. Lipid vitamins differ widely in their functions, as we will see below. 

A. Vitamin A 

Vitamin A, or retinol, is a 20-carbon lipid molecule obtained in the diet either directly or
indirectly from β-carotene. Carrots and other yellow vegetables are rich in β-carotene, a
40-carbon plant lipid whose enzymatic oxidative cleavage yields vitamin A (Figure 7.30).
Vitamin A exists in three forms that differ in the oxidation state of the terminal func- 
tional group: the stable alcohol retinol, the aldehyde retinal, and retinoic acid. Their hy- 
drophobic side chain is formed from repeated isoprene units (Section 9.6). 

All three vitamin A derivatives have important biological functions. Retinoic acid is
a signal compound that binds to receptor proteins inside cells; the ligand-receptor
complexes then bind to chromosomes and can regulate gene expression during cell
differentiation. The aldehyde retinal is a light-sensitive compound with an important
role in vision. Retinal is the prosthetic group of the protein rhodopsin; absorption of a
photon of light by retinal triggers a neural impulse.

◄ Figure 7.30

Formation of vitamin A from β-carotene.

▲ Figure 7.31

Vitamin D₃ (cholecalciferol) and 1,25-dihydroxycholecalciferol. (Vitamin D₂ has an
additional methyl group at C-24 and a trans double bond between C-22 and C-23.)
1,25-Dihydroxycholecalciferol is produced from vitamin D₃ by two separate hydroxylations.

B. Vitamin D 

Vitamin D is the collective name for a group of related lipids. Vitamin D₃ (cholecalciferol)
is formed nonenzymatically in the skin from the steroid 7-dehydrocholesterol
when humans are exposed to sufficient sunlight. Vitamin D₂, a compound related to
vitamin D₃ (D₂ has an additional methyl group), is the additive in fortified milk. The
active form of vitamin D₃, 1,25-dihydroxycholecalciferol, is formed from vitamin D₃ by
two hydroxylation reactions (Figure 7.31); vitamin D₂ is similarly activated. The active
compounds are hormones that help control Ca²⁺ utilization in humans: vitamin D
regulates both intestinal absorption of calcium and its deposition in bones. In
vitamin D-deficiency diseases, such as rickets in children and osteomalacia in adults, bones are
weak because calcium phosphate does not properly crystallize on the collagen matrix of 
the bones. 

C. Vitamin E 

Vitamin E, or α-tocopherol (Figure 7.32), is one of several closely related tocopherols,
compounds having a bicyclic oxygen-containing ring system with a hydrophobic side 
chain. The phenol group of vitamin E can undergo oxidation to a stable free radical. 
Vitamin E is believed to function as a reducing agent that scavenges oxygen and free 
radicals. This antioxidant action may prevent damage to fatty acids in biological 
membranes. A deficiency of vitamin E is rare but may lead to fragile red blood cells 
and neurological damage. The deficiency is almost always caused by genetic defects in 
absorption of fat molecules. There is currently no scientific evidence to support claims 
that vitamin E supplements in the diet of normal, healthy individuals will improve 

Phylloquinone (vitamin K) is an important
component of photosynthesis
reaction centers in bacteria, algae,
and plants.

D. Vitamin K 

Vitamin K (phylloquinone) (Figure 7.32) is a lipid vitamin from plants that is required 
for the synthesis of some of the proteins involved in blood coagulation. It is a coenzyme 
for a mammalian carboxylase that catalyzes the conversion of specific glutamate 
residues to γ-carboxyglutamate residues (Equation 7.7). The reduced (hydroquinone)
form of vitamin K participates in the carboxylation as a reducing agent. Oxidized 
vitamin K has to be regenerated in order to support further modifications of clotting 
factors. This is accomplished by vitamin K reductase. 

Vitamin E 

Figure 7.32 ► 

Structures of vitamin E and vitamin K. 


▲ Vitamin D and the evolution of skin color. Black skin protects cells from damage by sunlight but it may inhibit formation of vitamin D. This isn’t a 
problem in Nairobi, Kenya (left) but it might be in Stockholm, Sweden (right). One hypothesis for the evolution of skin color suggests that light- 
colored skin evolved in northern climates in order to increase vitamin D production. 

[Equation 7.7: glutamate residue → γ-carboxyglutamate residue; vitamin K reductase regenerates reduced vitamin K]


When calcium binds to the γ-carboxyglutamate residues of the coagulation proteins,
the proteins adhere to platelet surfaces where many steps of the coagulation
process take place. 

7.15 Ubiquinone 

Ubiquinone, also called coenzyme Q and therefore abbreviated “Q”, is a lipid-soluble
coenzyme synthesized by almost all species. Ubiquinone is a benzoquinone with four sub- 
stituents, one of which is a long hydrophobic chain. This chain of 6 to 10 isoprenoid units 
allows ubiquinone to dissolve in lipid membranes. In the membrane, ubiquinone trans- 
ports electrons between enzyme complexes. Some bacteria use menaquinone instead of 
ubiquinone (Figure 7.33a). An analog of ubiquinone, plastoquinone (Figure 7.33b), serves
a similar function in photosynthetic electron transport in chloroplasts (Chapter 15). 
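The size of the hydrophobic tail follows directly from the number of isoprenoid units. The short sketch below works out the carbon count for each tail length in the stated range of 6 to 10 units, taking five carbons per unit as given in the caption of Figure 7.33. The constant and function names are hypothetical.

```python
ISOPRENOID_CARBONS = 5  # carbons per isoprenoid unit


def tail_carbons(units):
    """Carbon count of a tail built from `units` isoprenoid units (6 to 10)."""
    if not 6 <= units <= 10:
        raise ValueError("the text gives a range of 6 to 10 isoprenoid units")
    return ISOPRENOID_CARBONS * units


# Map each allowed tail length to its carbon count.
table = {n: tail_carbons(n) for n in range(6, 11)}

if __name__ == "__main__":
    for units, carbons in table.items():
        print(f"{units} units -> {carbons} carbons")
```

The tail therefore ranges from 30 to 50 carbons, which is long enough to account for the solubility of ubiquinone in lipid membranes.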

Ubiquinone is a stronger oxidizing agent than either NAD⁺ or the flavin coenzymes.
Consequently, it can be reduced by NADH or FADH₂. Like FMN and FAD,
ubiquinone can accept or donate two electrons one at a time because it has three oxidation
states: oxidized Q, a partially reduced semiquinone free radical, and fully reduced QH₂,
called ubiquinol (Figure 7.34). Coenzyme Q plays a major role in membrane-associated
electron transport. It is responsible for moving protons from one side of the membrane
to the other by a process known as the Q cycle (Chapter 14). The resulting proton
gradient contributes to ATP synthesis. 
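The statement that ubiquinone is a stronger oxidizing agent than NAD⁺ can be made quantitative with the relation ΔG°′ = −nFΔE°′. The sketch below is a worked example under stated assumptions. The two standard reduction potentials are approximate, commonly tabulated values that do not appear in this section, and the function and constant names are hypothetical.

```python
FARADAY_KJ_PER_V_MOL = 96.485  # Faraday constant in kJ per volt per mole


def delta_g_standard(e_acceptor, e_donor, n_electrons=2):
    """Standard free-energy change in kJ/mol: dG = -n * F * (E_acceptor - E_donor)."""
    return -n_electrons * FARADAY_KJ_PER_V_MOL * (e_acceptor - e_donor)


# Assumed approximate standard reduction potentials in volts (not from this section).
E_NAD = -0.32  # NAD+ / NADH couple
E_Q = 0.04     # Q / QH2 couple

dg = delta_g_standard(E_Q, E_NAD)

if __name__ == "__main__":
    print(f"dG for NADH reducing Q: {dg:.1f} kJ/mol")
```

With these assumed potentials the result is about −69.5 kJ/mol. The negative sign is the point: electron transfer from NADH to ubiquinone is favorable under standard conditions, so the higher potential of the Q/QH₂ couple is what allows NADH to reduce it.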



Warfarin is an effective rat poison that has been used for 
many decades. It’s a competitive inhibitor of vitamin K reduc- 
tase, the enzyme that regenerates the reduced form of vitamin 
K (Equation 7.7). Blocking the formation of blood clotting 
factors leads to death in the rodents by internal bleeding. Ro- 
dents are very sensitive to inhibition of vitamin K reductase. 

Later on it was discovered that low concentrations of 
warfarin were effective in individuals who suffer from excessive 
blood clotting. The drug was renamed (e.g., Coumadin®) for 
use in humans since its association with rat poison had a 
somewhat negative connotation. 

Vitamin K analogs are widely used as anticoagulants in 
patients who are prone to thrombosis where they can prevent 
strokes and other embolisms. Like all medications, the dosage 
must be carefully regulated and controlled in order to prevent 
adverse effects, but in this case the dosage is even more critical. 

Since the drugs only affect the synthesis of new clotting factors,
they often take several days to have an effect. This is why
patients will often be started at low dosages of these analogs 
and the amount of drug will be increased slowly over the 
course of many months. 

▲ Warfarin. ▲ A rat (Rattus norvegicus).

Figure 7.33 ► 

Structures of (a) 
menaquinone and (b) plasto- 
quinone. The hydrophobic 
tail of each molecule is 
composed of 6 to 10 five- 
carbon isoprenoid units. 






Figure 7.34 ► 

Three oxidation states of ubiquinone. 

Ubiquinone is reduced in two one-electron 
steps via a semiquinone free-radical inter- 
mediate. The reactive center of ubiquinone 
is shown in red. 

[Scheme: Ubiquinone (Q) ⇌ Semiquinone anion (•Q⁻) ⇌ Ubiquinol (QH₂). The first step adds one electron (+e⁻); the second adds one electron and two protons (+2 H⁺, +e⁻).]


Unlike FAD or FMN, ubiquinone and its derivatives cannot accept or donate a pair 
of electrons in a single step. 
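The restriction to one-electron steps can be pictured as a three-state model in which each call moves the molecule exactly one state. This is a minimal sketch and not a chemical simulation. The class and method names are hypothetical, and protons are deliberately left out of the bookkeeping.

```python
STATES = ["Q", "semiquinone", "QH2"]  # oxidized, partially reduced, fully reduced


class Ubiquinone:
    """Toy model of the three oxidation states (protons are not tracked)."""

    def __init__(self, state="Q"):
        if state not in STATES:
            raise ValueError(f"unknown state: {state}")
        self.state = state

    def accept_electron(self):
        """Reduce by exactly one electron; fails if already fully reduced."""
        i = STATES.index(self.state)
        if i == len(STATES) - 1:
            raise ValueError("QH2 is fully reduced")
        self.state = STATES[i + 1]
        return self.state

    def donate_electron(self):
        """Oxidize by exactly one electron; fails if already fully oxidized."""
        i = STATES.index(self.state)
        if i == 0:
            raise ValueError("Q is fully oxidized")
        self.state = STATES[i - 1]
        return self.state


if __name__ == "__main__":
    q = Ubiquinone()
    print(q.accept_electron())  # first one-electron step
    print(q.accept_electron())  # second one-electron step
```

There is no method that skips the semiquinone state, which mirrors the point that full reduction to QH₂ always passes through the free-radical intermediate.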

7.16 Protein Coenzymes 

Some proteins act as coenzymes. They do not catalyze reactions by themselves but are 
required by certain other enzymes. These coenzymes are called either group transfer 
proteins or protein coenzymes. They contain a functional group either as part of their 
protein backbone or as a prosthetic group. Protein coenzymes are generally smaller 
and more heat-stable than most enzymes. They are called coenzymes because they par- 
ticipate in many different reactions and associate with a variety of different enzymes. 

Some protein coenzymes participate in group transfer reactions or in oxidation-reduction
reactions in which the transferred group is hydrogen or an electron. Metal
ions, iron-sulfur clusters, and heme groups are reactive centers commonly found in
these protein coenzymes. (Cytochromes are an important class of protein coenzymes
that contain heme prosthetic groups. See Section 7.17.) Several protein coenzymes have
two reactive thiol side chains that cycle between their dithiol and disulfide forms. For
example, thioredoxins have cysteines three residues apart (-Cys-X-X-Cys-). The
thiol side chains of these cysteine residues undergo reversible oxidation to form the
disulfide bond of a cystine unit. We will encounter thioredoxins as reducing agents
when we examine the citric acid cycle (Chapter 13), photosynthesis (Chapter 15), and
deoxyribonucleotide synthesis (Chapter 18). The disulfide reactive center of thioredoxin
is on the surface of the protein where it is accessible to the active sites of appropriate
enzymes (Figure 7.35).
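The Cys-X-X-Cys arrangement is easy to look for in a one-letter amino acid sequence. The sketch below uses a regular expression for that purpose. The function name is hypothetical, and the sequence fragment is made up for illustration rather than taken from any particular thioredoxin.

```python
import re

# A lookahead is used so that overlapping motifs would also be reported.
CXXC = re.compile(r"(?=(C..C))")


def find_cxxc(sequence):
    """Return (index, motif) pairs for every Cys-X-X-Cys in a one-letter sequence."""
    return [(m.start(), m.group(1)) for m in CXXC.finditer(sequence.upper())]


fragment = "AEWCGPCKMI"  # made-up fragment, for illustration only

if __name__ == "__main__":
    print(find_cxxc(fragment))
```

For this fragment the function reports a single motif, CGPC, starting at index 3 (counting from zero). A match only shows that two cysteines have the right spacing; whether they form a redox-active disulfide depends on the protein structure.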

Ferredoxin is another common oxidation-reduction coenzyme. It contains two 
iron-sulfur clusters that can accept or donate electrons (Figure 7.36).

Some other protein coenzymes contain firmly bound coenzymes or portions of 
coenzymes. In Escherichia coli, a carboxyl carrier protein containing covalently bound
biotin is one of three protein components of acetyl CoA carboxylase that catalyzes the 
first committed step of fatty acid synthesis. (In animal acetyl CoA carboxylases, the 
three protein components are fused into one protein chain.) ACP, introduced in Section 7.6, 
contains a phosphopantetheine moiety as its reactive center. The reactions of ACP 
therefore resemble those of coenzyme A. ACP is a component of all fatty acid synthases 
that have been tested. A protein coenzyme necessary for the degradation of glycine in 
mammals, plants, and bacteria (Chapter 17) contains a molecule of covalently bound 
lipoamide as a prosthetic group. 

7.17 Cytochromes 

Cytochromes are heme-containing protein coenzymes whose Fe(III) atoms undergo
reversible one-electron reduction. Some structures of cytochromes were shown
in Figures 4.21 and 4.24b. Cytochromes are classified as a, b, and c on the basis of
their visible absorption spectra. The absorption spectra of reduced and oxidized
cytochrome c are shown in Figure 7.37. Although the most strongly absorbing band is
the Soret (or γ) band, the band labeled α is used to characterize cytochromes as either
a, b, or c. Cytochromes in the same class may have slightly different spectra; therefore,
a subscript number denoting the peak wavelength of the α absorption band of the
reduced cytochrome often differentiates the cytochromes of a given class (e.g.,
cytochrome b₅₆₀). Wavelengths of maximum absorption for reduced cytochromes are
given in Table 7.3.
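The naming convention just described combines a class letter with the α-band peak of the reduced protein. A minimal sketch of that rule follows. The function name is hypothetical, and the subscript is written as ordinary digits because plain strings cannot carry subscripts.

```python
CLASSES = ("a", "b", "c")


def cytochrome_name(cyt_class, alpha_peak_nm):
    """Build a name such as 'cytochrome b560' from the class and alpha-band peak (nm)."""
    if cyt_class not in CLASSES:
        raise ValueError("class must be 'a', 'b', or 'c'")
    return f"cytochrome {cyt_class}{int(round(alpha_peak_nm))}"


if __name__ == "__main__":
    print(cytochrome_name("b", 560))
```

The wavelength must be measured on the reduced form, since the text notes that the α band is the one used for classification and the figure caption notes that it disappears on oxidation.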

Figure 7.37 ► 

Comparison of the absorption spectra of oxidized (red) and reduced (blue) horse cytochrome c. The reduced
cytochrome has three absorbance peaks, designated α, β, and γ. On oxidation, the Soret (or γ)
band decreases in intensity and shifts to a slightly shorter wavelength, whereas the α and β peaks
disappear, leaving a single broad band of absorbance.

The strength of coenzyme oxidizing 
agents (standard reduction potential) 
is described in Section 10.9. 

▲ Figure 7.35 

Oxidized thioredoxin. Note that the cystine 
group is on the exposed surface of the pro- 
tein. The sulfur atoms are shown in yellow. 
See Figure 4.24m for another view of thiore- 
doxin. [PDB 1ERU]. 

▲ Figure 7.36 

Ferredoxin. This ferredoxin from Pseudomonas 
aeruginosa contains two [4 Fe-4 S] iron- 
sulfur clusters that can be oxidized and re- 
duced. Ferredoxin is a common cosubstrate 
in many oxidation-reduction reactions. 

[PDB 2FG0] 


[Plot: Soret (γ) band labeled; horizontal axis: Wavelength (nm), 220 to 600]



Table 7.3 Absorption maxima (in nm) of major spectral bands in the visible
absorption spectra of the reduced cytochromes

Heme protein        Absorption band

Cytochrome c
Cytochrome b
Cytochrome a

The classes have slightly different heme prosthetic groups (Figure 7.38). The heme
of b-type cytochromes is the same as that of hemoglobin and myoglobin (Figure 4.44).
The heme of cytochrome a has a 17-carbon hydrophobic chain at C-2 of the porphyrin
ring and a formyl group at C-8, whereas the b-type heme has a vinyl group attached to
C-2 and a methyl group at C-8. In c-type cytochromes, the heme is covalently attached
to the apoprotein by two thioether linkages formed by addition of the thiol groups of 
two cysteine residues to vinyl groups of the heme. 

The tendency to transfer an electron to another substance, measured as a reduction 
potential, varies among individual cytochromes. The differences arise from the different 
environment each apoprotein provides for its heme prosthetic group. The reduction 
potentials of iron-sulfur clusters also vary widely depending on the chemical and physi- 
cal environment provided by the apoprotein. The range of reduction potentials among 
prosthetic groups is an important feature of membrane-associated electron transport
pathways (Chapter 14) and photosynthesis (Chapter 15). 

Figure 7.38 ►

Heme groups of (a) cytochrome a, (b) cytochrome b, and (c) cytochrome c.



The discovery of vitamins in the first part of the 20th century 
stimulated an enormous amount of biochemistry research. 
What were these mysterious chemicals that seemed essential 
for life? Why were they essential? 

We now take vitamins and coenzymes for granted but 
that doesn’t do justice to the workers who discovered their 
role in metabolism. Here’s a list of the scientists who received 
Nobel Prizes for their work on vitamins and coenzymes. 

Chemistry 1928: Adolf Otto Reinhold Windaus “for the serv- 
ices rendered through his research into the constitution of the 
sterols and their connection with the vitamins.” 

Physiology or Medicine 1929: Christiaan Eijkman “for his 
discovery of the antineuritic vitamin.” Sir Frederick Gowland
Hopkins “for his discovery of the growth-stimulating vitamins.”

Chemistry 1937: Paul Karrer “for his investigations on 
carotenoids, flavins and vitamins A and B₂.” Walter Norman
Haworth “for his investigations on carbohydrates and vita- 
min C.” 

Physiology or Medicine 1937: Albert von Szent-Györgyi
Nagyrápolt “for his discoveries in connection with the biological
combustion processes, with special reference to vitamin C
and the catalysis of fumaric acid.”

Chemistry 1938: Richard Kuhn “for his work on carotenoids 
and vitamins.” 

Physiology or Medicine 1943: Henrik Carl Peter Dam “for 
his discovery of vitamin K.” Edward Adelbert Doisy “for his 
discovery of the chemical nature of vitamin K.” 

Physiology or Medicine 1953: Fritz Albert Lipmann “for his 
discovery of co-enzyme A and its importance for intermedi- 
ary metabolism.” 

Chemistry 1964: Dorothy Crowfoot Hodgkin “for her deter- 
minations by X-ray techniques of the structures of important 
biochemical substances.” 

Chemistry 1970: Luis F. Leloir “for his discovery of sugar nu- 
cleotides and their role in the biosynthesis of carbohydrates.” 

Chemistry 1997: Paul D. Boyer and John E. Walker “for their 
elucidation of the enzymatic mechanism underlying the syn- 
thesis of adenosine triphosphate (ATP).” 

▲ Nobel Medals. Chemistry (left), Physiology or Medicine (right). 


Summary

1. Many enzyme-catalyzed reactions require cofactors. Cofactors include
essential inorganic ions and group-transfer reagents called
coenzymes. Coenzymes can either function as cosubstrates or remain
bound to enzymes as prosthetic groups.

2. Inorganic ions, such as K⁺, Mg²⁺, Ca²⁺, Zn²⁺, and Fe³⁺, may
participate in substrate binding or in catalysis. 

3. Some coenzymes are synthesized from common metabolites; oth- 
ers are derived from vitamins. Vitamins are organic compounds 
that must be supplied in small amounts in the diets of humans 
and other animals. 

4. The pyridine nucleotides, NAD⁺ and NADP⁺, are coenzymes
for dehydrogenases. Transfer of a hydride ion (H⁻) from a specific
substrate reduces NAD⁺ or NADP⁺ to NADH or NADPH,
respectively, and releases a proton.

5. The coenzyme forms of riboflavin, FAD and FMN, are tightly
bound as prosthetic groups. FAD and FMN are reduced by
hydride (two-electron) transfers to form FADH₂ and FMNH₂,
respectively. The reduced flavin coenzymes donate electrons one or
two at a time.

6. Coenzyme A, a derivative of pantothenate, participates in acyl- 
group-transfer reactions. Acyl carrier protein is required in the 
synthesis of fatty acids. 

7. The coenzyme form of thiamine is thiamine diphosphate (TDP), 
whose thiazolium ring binds the aldehyde generated on decarboxylation
of an α-keto acid substrate.

8. Pyridoxal 5′-phosphate is a prosthetic group for many enzymes
in amino acid metabolism. The aldehyde group at C-4 of PLP 
forms a Schiff base with an amino acid substrate, through which 
it stabilizes a carbanion intermediate. 

9. Vitamin C is a vitamin but not a coenzyme. It’s a substrate in 
several reactions including those required in the synthesis of 
collagen. Vitamin C deficiency causes scurvy. Primates need an 
external source of vitamin C because they have lost one of the 
key enzymes required for its synthesis. The gene for this enzyme 
is a pseudogene in certain primate genomes. 

10. Biotin, a prosthetic group for several carboxylases and carboxyl- 
transferases, is covalently linked to a lysine residue at the enzyme 
active site. 

11. Tetrahydrofolate is a reduced derivative of folate and participates 
in the transfer of one-carbon units at the oxidation levels of 
methanol, formaldehyde, and formic acid. Tetrahydrobiopterin is 
a reducing agent in some hydroxylation reactions. 

12. The coenzyme forms of cobalamin — adenosylcobalamin and 
methylcobalamin — contain cobalt and a corrin ring system. 
These coenzymes participate in a few intramolecular rearrange- 
ments and methylation reactions. 

13. Lipoamide, a prosthetic group for α-keto acid dehydrogenase
multienzyme complexes, accepts an acyl group, forming a thioester. 

14. The four fat-soluble, or lipid, vitamins are A, D, E, and K. These 
vitamins have diverse functions. 

15. Ubiquinone is a lipid-soluble electron carrier that transfers electrons
one or two at a time.

16. Some proteins, such as acyl carrier protein and thioredoxin, act as
coenzymes in group-transfer reactions or in oxidation-reduction
reactions in which the transferred group is hydrogen or an electron.

17. Cytochromes are small, heme-containing protein coenzymes that
participate in electron transport. They are differentiated by their
absorption spectra.

Problems

1. For each of the following enzyme-catalyzed reactions, determine
the type of reaction and the coenzyme that is likely to participate.

(a) CH₃-CH(OH)-COO⁻ → CH₃-C(=O)-COO⁻

(b) CH₃-CH₂-C(=O)-COO⁻ → CH₃-CH₂-C(=O)-H + CO₂

(c) CH₃-C(=O)-S-CoA + HCO₃⁻ + ATP → ⁻OOC-CH₂-C(=O)-S-CoA + ADP + Pᵢ

(d) ⁻OOC-CH(CH₃)-C(=O)-S-CoA → ⁻OOC-CH₂-CH₂-C(=O)-S-CoA

(e) CH₃-CH(OH)-TPP + HS-CoA → CH₃-C(=O)-S-CoA + TPP

2. List the coenzymes that 

(a) participate as oxidation-reduction reagents. 

(b) act as acyl carriers. 

(c) transfer methyl groups. 

(d) transfer groups to and from amino acids. 

(e) are involved in carboxylation or decarboxylation reactions. 

3. In the oxidation of lactate to pyruvate by lactate dehydrogenase
(LDH), NAD⁺ is reduced in a two-electron transfer process from
lactate. Since two protons are removed from lactate as well, is it correct
to write the reduced form of the coenzyme as NADH₂? Explain.




CH₃-CH(OH)-COO⁻ (lactate) → CH₃-C(=O)-COO⁻ (pyruvate)




4. Succinate dehydrogenase requires FAD to catalyze the oxidation 
of succinate to fumarate in the citric acid cycle. Draw the isoalloxazine 
ring system of the cofactor resulting from the oxidation of succinate
to fumarate and indicate which hydrogens in FADH₂ are
lacking in FAD. 

⁻OOC-CH₂-CH₂-COO⁻ (succinate) → ⁻OOC-CH=CH-COO⁻ (fumarate)

5. What is the common structural feature of NAD⁺, FAD, and
coenzyme A? 

6. Certain nucleophiles can add to C-4 of the nicotinamide ring of
NAD⁺, in a manner similar to the addition of a hydride in the reduction
of NAD⁺ to NADH. Isoniazid is the most widely used
drug for the treatment of tuberculosis. X-ray studies have shown
that isoniazid inhibits a crucial enzyme in the tuberculosis bacterium
where a covalent adduct is formed between the carbonyl
of isoniazid and the 4' position of the nicotinamide ring of a
bound NAD⁺ molecule. Draw the structure of this NAD-isoniazid
inhibitory adduct.




7. A vitamin B₆ deficiency in humans can result in irritability,
nervousness, depression, and sometimes convulsions. These
symptoms may result from decreased levels of the neurotransmitters
serotonin and norepinephrine, which are metabolic derivatives
of tryptophan and tyrosine, respectively. How could a
deficiency of vitamin B₆ result in decreased levels of serotonin
and norepinephrine?




8. Macrocytic anemia is a disease in which red blood cells mature 
slowly due to a decreased rate of DNA synthesis. The red blood cells 
are abnormally large (macrocytic) and are more easily ruptured. 
How could the anemia be caused by a deficiency of folic acid? 

9. A patient suffering from methylmalonic aciduria (high levels of
methylmalonic acid) has high levels of homocysteine and low 
levels of methionine in the blood and tissues. Folic acid levels are 

(a) What vitamin is likely to be deficient? 

(b) How could the deficiency produce the symptoms listed above? 

(c) Why is this vitamin deficiency more likely to occur in a per- 
son who follows a strict vegetarian diet? 

10. Alcohol dehydrogenase (ADH) from yeast is a metalloenzyme
that catalyzes the NAD⁺-dependent oxidation of ethanol to acetaldehyde.
The mechanism of yeast ADH is similar to that of
lactate dehydrogenase (LDH) (Figure 7.9) except that the zinc
ion of ADH occupies the place of His-195 in LDH.

(a) Draw a mechanism for the oxidation of ethanol to acetalde- 
hyde by yeast ADH. 

(b) Does ADH require a residue analogous to Arg-171 in LDH? 

11. In biotin-dependent transcarboxylase reactions, an enzyme transfers
a carboxyl group between substrates in a two-step process
without the need for ATP or bicarbonate. The reaction catalyzed
by the enzyme methylmalonyl CoA-pyruvate transcarboxylase is
shown below. Draw the structures of the products expected from
the first step of the reaction.

⁻OOC-CH(CH₃)-C(=O)-S-CoA (methylmalonyl CoA) + CH₃-C(=O)-COO⁻ (pyruvate) →
CH₃-CH₂-C(=O)-S-CoA (propionyl CoA) + ⁻OOC-CH₂-C(=O)-COO⁻ (oxaloacetate)

12. (a) Histamine is produced from histidine by the action of a de-
carboxylase. Draw the external aldimine produced by the re- 
action of histidine and pyridoxal phosphate at the active site 
of histidine decarboxylase. 

(b) Since racemization of amino acids by PLP-dependent en- 
zymes proceeds via Schiff base formation, would racemiza- 
tion of L-histidine to D-histidine occur during the histidine 
decarboxylase reaction? 

13. (a) Thiamine pyrophosphate is a coenzyme for oxidative decarboxylation
reactions in which the keto carbonyl carbon is oxidized
to an acid or an acid derivative. Oxidation occurs by
removal of two electrons from a resonance-stabilized carbanion
intermediate. What is the mechanism for the reaction
pyruvate + HS-CoA → acetyl CoA + CO₂, beginning from
the resonance-stabilized carbanion intermediate formed after
decarboxylation (Figure 7.15)

(b) Pyruvate dehydrogenase (PDH) is an enzyme complex that
catalyzes the oxidative decarboxylation of pyruvate to acetyl
CoA and CO2 in a multistep reaction. The oxidation and
acetyl-group transfer steps require TDP and lipoic acid in
addition to other coenzymes. Draw the chemical structures
for the molecules in the following two steps in the PDH

HETDP + lipoamide → acetyl-TDP + dihydrolipoamide →
TDP + acetyl-dihydrolipoamide

(c) In a transketolase enzyme TDP-dependent reaction, the 
resonance-stabilized carbanion intermediate shown adjacent 
is generated as an intermediate. This intermediate is then in- 
volved in a condensation reaction (resulting in C — C bond 
formation) with the aldehyde group of erythrose 4-phos- 
phate (E4P) to form fructose 6-phosphate (F6P). Starting 
from the carbanion intermediate, show a mechanism for this 
transketolase reaction. (Fischer projections of carbohydrate 
structures are sometimes drawn as shown here.) 


Intermediate + Erythrose 4-phosphate → Fructose 6-phosphate (structures drawn as Fischer projections)



226 CHAPTER 7 Coenzymes and Vitamins 












Carbohydrates (also called saccharides) are, on the basis of mass, the most
abundant class of biological molecules on Earth. Although all organisms can 
synthesize carbohydrate, much of it is produced by photosynthetic organ- 
isms, including bacteria, algae, and plants. These organisms convert solar energy to 
chemical energy that is then used to make carbohydrate from carbon dioxide. Carbo- 
hydrates play several crucial roles in living organisms. In animals and plants, carbohy- 
drate polymers act as energy storage molecules. Animals can ingest carbohydrates 
that can then be oxidized to yield energy for metabolic processes. Polymeric carbohy- 
drates are also found in cell walls and in the protective coatings of many organisms. 
Other carbohydrate polymers are marker molecules that allow one type of cell to rec- 
ognize and interact with another type. Carbohydrate derivatives are found in a num- 
ber of biological molecules, including some coenzymes (Chapter 7) and the nucleic 
acids (Chapter 19). 

The name carbohydrate, “hydrate of carbon,” refers to their empirical formula
(CH₂O)ₙ, where n is 3 or greater (n is usually 5 or 6 but can be up to 9). Carbohydrates
can be described by the number of monomeric units they contain. Monosaccharides are
the smallest units of carbohydrate structure. Oligosaccharides are polymers of two to
about 20 monosaccharide residues. The most common oligosaccharides are
disaccharides, which consist of two linked monosaccharide residues. Polysaccharides are
polymers that contain many (usually more than 20) monosaccharide residues.
Oligosaccharides and polysaccharides do not have the empirical formula (CH₂O)ₙ
because water is eliminated during polymer formation. The term glycan is a more general
term for carbohydrate polymers. It can refer to a polymer of identical sugars (homoglycan)
or of different sugars (heteroglycan).
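
The effect of water elimination on the formula can be checked with a little arithmetic. The following is a minimal sketch, assuming a glycan built only from unmodified hexose residues (C6H12O6) joined by n - 1 glycosidic bonds, each of which releases one molecule of water. The function names are illustrative and do not come from the text.

```python
def hexose_glycan_formula(n_residues):
    """Return (C, H, O) atom counts for a glycan of n unmodified hexose residues.

    Assumption: every residue starts as C6H12O6 and the residues are joined
    by n - 1 glycosidic bonds, each eliminating one molecule of water.
    """
    if n_residues < 1:
        raise ValueError("need at least one residue")
    waters_lost = n_residues - 1
    carbons = 6 * n_residues
    hydrogens = 12 * n_residues - 2 * waters_lost
    oxygens = 6 * n_residues - waters_lost
    return carbons, hydrogens, oxygens


def has_ch2o_formula(carbons, hydrogens, oxygens):
    """True if the atom counts fit the empirical formula (CH2O)n."""
    return hydrogens == 2 * carbons and oxygens == carbons


monosaccharide = hexose_glycan_formula(1)  # a free hexose
disaccharide = hexose_glycan_formula(2)    # two hexoses, one water lost
```

Under these assumptions a free hexose fits (CH₂O)ₙ, whereas a disaccharide of two hexoses (C12H22O11) does not.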

Glycoconjugates are carbohydrate derivatives in which one or more carbohydrate 
chains are linked covalently to a peptide, protein, or lipid. These derivatives include pro- 
teoglycans, peptidoglycans, glycoproteins, and glycolipids. 

In this chapter, we discuss nomenclature, structure, and function of
monosaccharides, disaccharides, and the major homoglycans: starch, glycogen, cellulose, and
chitin. We then consider proteoglycans, peptidoglycans, and glycoproteins, all of which
contain heteroglycan chains.

Molecular biology has dealt largely on the triad of DNA, RNA and protein.
Biochemistry is concerned with all the molecules of the cell. Excluded
from the province of molecular biology have been most of the structures
and functions essential for growth and maintenance: carbohydrates,
coenzymes, lipids, and membranes.

Arthur Kornberg
"For the love of enzymes: the odyssey of a biochemist" (1989)

Photosynthesis is described in detail in Chapter 15.

Top: Darkling beetle. The exoskeletons of insects contain chitin, a homoglycan.

228 CHAPTER 8 Carbohydrates

A Fischer projection is a convention designed to convey information about the
stereochemistry of a molecule. It does not resemble the actual conformation of the
molecule in solution.

For each chiral carbon atom in a Fischer projection the vertical bonds
project into the plane of the page and the horizontal bonds project upward
toward the viewer.

Mirror plane: L-Glyceraldehyde | D-Glyceraldehyde
▲ Figure 8.2
View of L-glyceraldehyde (left) and D-glyceraldehyde (right). These molecules are drawn
in a conformation that corresponds to the Fischer projections in Figure 8.1.

8.1 Most Monosaccharides Are Chiral Compounds 

Monosaccharides are water-soluble, white, crystalline solids that have a sweet taste. Ex- 
amples include glucose and fructose. Chemically, monosaccharides are polyhydroxy 
aldehydes, or aldoses, or polyhydroxy ketones, or ketoses. They are classified by their 
type of carbonyl group and their number of carbon atoms. As a rule, the suffix -ose is 
used in naming carbohydrates, although there are a number of exceptions. All mono- 
saccharides contain at least three carbon atoms. One of these is the carbonyl carbon, 
and each of the remaining carbon atoms bears a hydroxyl group. In aldoses, the most
oxidized carbon atom is designated C-1 and is drawn at the top of a Fischer projection.
In ketoses, the most oxidized carbon atom is usually C-2.

We’ve encountered Fischer projections before but now it’s time to present the con- 
vention in more detail. A Fischer projection is a two-dimensional representation of a 
three-dimensional molecule. It is designed to preserve information about the
stereochemistry of a molecule. In a Fischer projection of sugars, the C-1 atom is always at
the top of the figure. For each separate chiral carbon atom, the two horizontal bonds 
project upward from the page toward you. The two vertical bonds project downward 
into the page. Remember, this applies to each chiral carbon atom, so in a carbohydrate 
with multiple carbon atoms the Fischer projection represents a molecule that curls back 
into the page. For longer molecules, the top and bottom groups may even come in vir- 
tual contact, forming a loop. The Fischer projection is a convention for preserving 
stereochemical information; it does not represent a realistic model of how a molecule 
might look in solution. 

The smallest monosaccharides are trioses, or three-carbon sugars. One- or two-carbon
compounds having the general formula (CH₂O)ₙ do not have properties typical of
carbohydrates (such as sweet taste and the ability to crystallize). The aldehydic triose, or
aldotriose, is glyceraldehyde (Figure 8.1a). Glyceraldehyde is chiral because its central
carbon, C-2, has four different groups attached to it (Section 3.1). The ketonic triose, or
ketotriose, is dihydroxyacetone (Figure 8.1b). It is achiral because it has no asymmetric
carbon atom. All other monosaccharides, longer-chain versions of these two sugars, are chiral.

The stereoisomers D- and L-glyceraldehyde are shown as ball-and-stick models in
Figure 8.2. Chiral molecules are optically active; that is, they rotate the plane of
polarized light. The convention for designating D and L isomers was originally based on the
optical properties of glyceraldehyde. The form of glyceraldehyde that caused rotation to
the right (dextrorotatory) was designated D and the form that caused rotation to the left
(levorotatory) was designated L. Structural knowledge was limited when this
convention was established in the late 19th century so the configurations for the enantiomers
of glyceraldehyde were assigned arbitrarily, with a 50% probability of error. X-ray
crystallographic experiments later proved that the original structural assignments were correct.

▲ Figure 8.1
Fischer projections of (a) glyceraldehyde and (b) dihydroxyacetone. The designations L (for left) and D
(for right) for glyceraldehyde refer to the configuration of the hydroxyl group of the chiral carbon
(C-2). Dihydroxyacetone is achiral.
(a) L-Glyceraldehyde, top to bottom: CHO, HO-C-H, CH2OH. D-Glyceraldehyde, top to bottom: CHO, H-C-OH, CH2OH.
(b) Dihydroxyacetone, top to bottom: CH2OH, C=O, CH2OH.



▲ Figure 8.3
Fischer projections of the three- to six-carbon D-aldoses. The aldoses shown in blue are the most important in our study of biochemistry.
Labeled structures: D-Glucose, D-Mannose, D-Gulose, D-Idose, D-Galactose, D-Talose.

Longer aldoses and ketoses can be regarded as extensions of glyceraldehyde and
dihydroxyacetone, respectively, with chiral H-C-OH groups inserted between the
carbonyl carbon and the primary alcohol group. Figure 8.3 shows the complete list of the
names and structures of the tetroses (four-carbon aldoses), pentoses (five-carbon
aldoses), and hexoses (six-carbon aldoses) related to D-glyceraldehyde. Many of these
monosaccharides are not synthesized by most organisms and we will not encounter
them again in this book.

Note that the carbon atoms are numbered from the carbon of the aldehyde group
that is assigned the number 1. By convention, sugars are said to have the D configuration
when the configuration of the chiral carbon with the highest number (the chiral carbon
most distant from the carbonyl carbon) is the same as that of C-2 of D-glyceraldehyde
(i.e., the -OH group attached to this carbon atom is on the right side in a Fischer
projection). The arrangement of asymmetric carbon atoms is unique for each
monosaccharide, giving each its distinctive properties. Except for glyceraldehyde (which was
used as the standard), there is no predictable association between the absolute
configuration of a sugar and whether it is dextrorotatory or levorotatory.

Figure 8.4 ►
L- and D-glucose. Fischer projections (left) showing that L- and D-glucose are mirror
images. Conformation of the extended form of D-glucose in solution.

It is mostly the D enantiomers that are synthesized in living cells, just as the
L enantiomers of amino acids are more common. The L enantiomers of the 15 aldoses in
Figure 8.3 are not shown. Recall that pairs of enantiomers are mirror images; in other
words, the configuration at each chiral carbon is opposite. For example, the hydroxyl
groups bound to carbon atoms 2, 3, 4, and 5 of D-glucose point right, left, right, and
right, respectively, in the Fischer projection; those of L-glucose point left, right, left, and
left (Figure 8.4).
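
The mirror-image relationship and the D/L rule can be written as two small functions. This is only an illustrative sketch: the encoding of a Fischer projection as a string of "R" and "L" (the side carrying the hydroxyl group at each chiral carbon, read from C-2 downward) and the function names are assumptions made for the example.

```python
# Hydroxyl sides at C-2, C-3, C-4, and C-5 of D-glucose, as given in the text.
D_GLUCOSE = "RLRR"


def enantiomer(config):
    """Invert the configuration at every chiral carbon (mirror image)."""
    return "".join("L" if side == "R" else "R" for side in config)


def series(config):
    """Return "D" if the hydroxyl on the highest-numbered chiral carbon
    is on the right in the Fischer projection, otherwise "L"."""
    return "D" if config[-1] == "R" else "L"


L_GLUCOSE = enantiomer(D_GLUCOSE)
```

Applying `enantiomer` twice returns the starting configuration, as expected for a pair of mirror images.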

The three-carbon aldose, glyceraldehyde, has only a single chiral atom (C-2) and
therefore only two stereoisomers. There are four stereoisomers for aldotetroses (D- and
L-erythrose and D- and L-threose) because erythrose and threose each possess two chiral
carbon atoms. In general, there are 2ⁿ possible stereoisomers for a compound with n
chiral carbons. Aldohexoses, which possess four chiral carbons, have a total of 2⁴, or 16,
stereoisomers (the eight D aldohexoses in Figure 8.3 and their L enantiomers).
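
The counting rule can be sketched in a few lines. The sketch assumes an unbranched open-chain sugar in which only the carbonyl carbon and the terminal CH2OH carbons are not chiral, so an aldose with k carbons has k - 2 chiral carbons and a ketose has k - 3. The function names are hypothetical.

```python
def chiral_carbons(n_carbons, kind):
    """Number of chiral carbons in an unbranched open-chain monosaccharide.

    Assumption: in an aldose, C-1 (aldehyde) and the last carbon (CH2OH)
    are not chiral; in a ketose, C-1, C-2 (ketone), and the last carbon
    are not chiral.
    """
    if n_carbons < 3:
        raise ValueError("monosaccharides have at least three carbons")
    if kind == "aldose":
        return n_carbons - 2
    if kind == "ketose":
        return n_carbons - 3
    raise ValueError("kind must be 'aldose' or 'ketose'")


def stereoisomer_count(n_carbons, kind):
    """2**n possible stereoisomers for n chiral carbons."""
    return 2 ** chiral_carbons(n_carbons, kind)
```

For example, `stereoisomer_count(6, "aldose")` gives the 16 aldohexoses, and `stereoisomer_count(3, "ketose")` gives 1, consistent with dihydroxyacetone being achiral.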

Sugar molecules that differ in configuration at only one of several chiral centers are 
called epimers. For example, D-mannose and D-galactose are epimers of D-glucose (at C-2 
and C-4, respectively), although they are not epimers of each other (Figure 8.3). 
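
The definition of an epimer can be tested mechanically. In the sketch below each sugar is encoded by the side ("R" or "L") of the hydroxyl group at C-2 through C-5 in its Fischer projection. The D-glucose string follows the description in the text; the strings for D-mannose and D-galactose are derived from the statement that they differ from D-glucose only at C-2 and C-4, respectively. The encoding and function name are assumptions for illustration.

```python
D_ALDOHEXOSES = {
    "D-glucose": "RLRR",
    "D-mannose": "LLRR",    # differs from D-glucose at C-2 only
    "D-galactose": "RLLR",  # differs from D-glucose at C-4 only
}


def epimeric_carbon(config_a, config_b):
    """Return the carbon number at which two sugars differ if they differ
    at exactly one chiral center, otherwise None.

    Position 0 of each string corresponds to C-2.
    """
    if len(config_a) != len(config_b):
        raise ValueError("sugars must have the same number of chiral carbons")
    differences = [
        index + 2
        for index, (a, b) in enumerate(zip(config_a, config_b))
        if a != b
    ]
    return differences[0] if len(differences) == 1 else None


glucose = D_ALDOHEXOSES["D-glucose"]
mannose = D_ALDOHEXOSES["D-mannose"]
galactose = D_ALDOHEXOSES["D-galactose"]
```

Mannose and galactose differ from each other at two centers (C-2 and C-4), so the function returns None for that pair.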

Longer-chain ketoses (Figure 8.5) are related to dihydroxyacetone in the same way
that longer-chain aldoses are related to glyceraldehyde. Note that a ketose has one fewer
chiral carbon atom than the aldose of the same empirical formula. For example, there
are only two stereoisomers for the one ketotetrose (D- and L-erythrulose), and four
stereoisomers for ketopentoses (D- and L-xylulose and D- and L-ribulose). Ketotetrose
and ketopentoses are named by inserting -ul- in the name of the corresponding aldose.
For example, the ketose xylulose corresponds to the aldose xylose. This nomenclature
does not apply to the ketohexoses (tagatose, sorbose, psicose, and fructose) because they
have traditional (trivial) names.
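
The -ul- rule amounts to a simple string operation. The sketch below is illustrative only; it assumes the aldose name ends in "-ose" and, following the text, is meant for tetroses and pentoses, not for hexoses, whose ketoses have trivial names. The function name is hypothetical.

```python
def ketose_name(aldose_name):
    """Insert -ul- before the -ose ending of a tetrose or pentose name."""
    if not aldose_name.endswith("ose"):
        raise ValueError("expected a sugar name ending in -ose")
    return aldose_name[:-3] + "ulose"


names = [ketose_name(aldose) for aldose in ("erythrose", "ribose", "xylose")]
```

The three results correspond to the ketotetrose and ketopentoses named in the paragraph above.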











8.2 Cyclization of Aldoses and Ketoses 

The optical behavior of some monosaccharides suggests they have one more chiral 
carbon atom than is evident from the structures shown in Figures 8.3 and 8.5. 
D-Glucose, for example, exists in two forms that contain five (not four) asymmetric carbons. 
The source of this additional asymmetry is an intramolecular cyclization reaction that 
produces a new chiral center at the carbon atom of the carbonyl group. This cyclization 
resembles the reaction of an alcohol with an aldehyde to form a hemiacetal or with a 
ketone to form a hemiketal (Figure 8.7). 

The carbonyl carbon of an aldose containing at least five carbon atoms or of a
ketose containing at least six carbon atoms can react with an intramolecular hydroxyl
group to form a cyclic hemiacetal or cyclic hemiketal, respectively. The oxygen atom
from the reacting hydroxyl group becomes a member of the five- or six-membered ring
structures (Figure 8.8).

▲ Who am I? The structures of the D sugars are shown in Figures 8.3 and 8.5. You can
deduce the structures of the L configurations. Knowing the convention for Fischer
projections, you should have no trouble identifying these molecules.

◄ Figure 8.5
Fischer projections of the three- to six-carbon D-ketoses. The ketoses shown in blue are the
most important in our study of biochemistry.

Because it resembles the six-membered heterocyclic compound pyran (Figure 8.6a), 
the six-membered ring of a monosaccharide is called a pyranose. Similarly, because the 
five-membered ring of a monosaccharide resembles furan (Figure 8.6b), it is called a 
furanose. Note that, unlike pyran and furan, the rings of carbohydrates do not contain 
double bonds. 

The most oxidized carbon of a cyclized monosaccharide, the one attached to two
oxygen atoms, is referred to as the anomeric carbon. In ring structures, the anomeric
carbon is chiral. Thus, the cyclized aldose or ketose can adopt either of two configurations
(designated α or β), as illustrated for D-glucose in Figure 8.8. The α and β isomers are
called anomers.

▲ Figure 8.6
(a) Pyran and (b) furan.

▲ Figure 8.7
Hemiacetal and hemiketal. (a) Reaction of an alcohol with an aldehyde to form a
hemiacetal. (b) Reaction of an alcohol with a ketone to form a hemiketal. The asterisks
indicate the newly formed chiral centers.

In solution, aldoses and ketoses that form ring structures equilibrate among their
various cyclic and open-chain forms. At 31°C, for example, D-glucose exists in an equilibrium
mixture of approximately 64% β-D-glucopyranose and 36% α-D-glucopyranose, with very
small amounts of the furanose (Figure 8.9) and open-chain (Figure 8.4) forms. Similarly,
D-ribose exists as a mixture of approximately 58.5% β-D-ribopyranose, 21.5%
α-D-ribopyranose, 13.5% β-D-ribofuranose, and 6.5% α-D-ribofuranose, with a tiny fraction in the
open-chain form (Figure 8.10). The relative abundance of the various forms of
monosaccharides at equilibrium reflects the relative stabilities of each form. Although unsubstituted
D-ribose is most stable as the β-pyranose, its structure in nucleotides (Section 8.5c) is the
β-furanose form.
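
The link between relative abundance and relative stability can be made quantitative with the standard thermodynamic relation ΔG° = -RT ln K, which is not stated in this passage and is brought in here as an assumption. The sketch below treats the α to β interconversion of D-glucopyranose as a simple two-state equilibrium, ignores the minor furanose and open-chain forms, and uses the 64% and 36% values quoted above. The function name is illustrative.

```python
import math

GAS_CONSTANT = 8.314  # J/(mol*K)


def standard_free_energy_change(percent_product, percent_reactant, temp_celsius):
    """Return the standard free energy change (J/mol) for
    reactant -> product from equilibrium abundances,
    using delta_G = -R * T * ln(K)."""
    equilibrium_constant = percent_product / percent_reactant
    temperature_kelvin = temp_celsius + 273.15
    return -GAS_CONSTANT * temperature_kelvin * math.log(equilibrium_constant)


# alpha-D-glucopyranose -> beta-D-glucopyranose at 31 degrees C
delta_g = standard_free_energy_change(64.0, 36.0, 31.0)
```

Under these assumptions the β anomer is favored by roughly 1.5 kJ/mol, a small difference that is nevertheless enough to give the observed 64:36 ratio.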

The ring drawings shown in these figures are called Haworth projections, after
Norman Haworth who worked on the cyclization reactions of carbohydrates and first
proposed these representations. He received the Nobel Prize in Chemistry in 1937 for
his work on carbohydrate structure and the synthesis of vitamin C.

Figure 8.8 ►
Cyclization of D-glucose to form glucopyranose.
The Fischer projection (top left) is rearranged into a three-dimensional representation
(top right). Rotation of the bond between C-4 and C-5 brings the C-5 hydroxyl group close
to the C-1 aldehyde group. Reaction of the C-5 hydroxyl group with one side of C-1 gives
α-D-glucopyranose; reaction of the hydroxyl group with the other side gives
β-D-glucopyranose. The glucopyranose products are shown as Haworth projections in which the lower
edges of the ring (thick lines) project in front of the plane of the paper and the upper
edges project behind the plane of the paper. In the α-D-anomer of glucose, the hydroxyl
group at C-1 points down; in the β-D-anomer, it points up.

A Haworth projection adequately indicates stereochemistry and can be easily re- 
lated to a Fischer projection: groups on the right in a Fischer projection point downwards 
in a Haworth projection. Because rotation around carbon-carbon bonds is constrained 
in the ring structure, the Haworth projection is a much more faithful representation of 
the actual conformation of sugars. 

By convention, a cyclic monosaccharide is drawn so the anomeric carbon is on the
right and the other carbons are numbered in a clockwise direction. In a Haworth
projection, the configuration of the anomeric carbon atom is designated α if its hydroxyl
group is cis to (on the same side of the ring as) the oxygen atom of the highest-numbered
chiral carbon atom. It is β if its hydroxyl group is trans to (on the opposite side
of the ring from) the oxygen attached to the highest-numbered chiral carbon. With
α-D-glucopyranose, the hydroxyl group at the anomeric carbon points down; with
β-D-glucopyranose, it points up.
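
The two drawing rules stated above (right in a Fischer projection means down in a Haworth projection; for D-glucopyranose, α means the anomeric hydroxyl points down and β means it points up) can be captured in a short sketch. The "R"/"L" encoding and the function names are assumptions for illustration, and the anomer function covers only the D-glucopyranose case described in the text.

```python
def haworth_direction(fischer_side):
    """Map a Fischer-projection side ("R" or "L") to a Haworth direction."""
    directions = {"R": "down", "L": "up"}
    return directions[fischer_side]


def d_glucopyranose_anomer(anomeric_oh_direction):
    """Name the anomer of D-glucopyranose from the direction of the
    hydroxyl group on the anomeric carbon in a Haworth projection."""
    if anomeric_oh_direction == "down":
        return "alpha"
    if anomeric_oh_direction == "up":
        return "beta"
    raise ValueError("direction must be 'up' or 'down'")


# Hydroxyl sides at C-2, C-3, and C-4 of D-glucose in the Fischer projection.
ring_hydroxyls = [haworth_direction(side) for side in "RLR"]
```

For D-glucose this places the hydroxyl groups at C-2, C-3, and C-4 down, up, and down, respectively.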

Monosaccharides are often drawn in either the α- or β-D-furanose or the α- or
β-D-pyranose form. However, you should remember that the anomeric forms of five-
and six-carbon sugars are in rapid equilibrium. Throughout this chapter and the rest
of the book, we draw sugars in the correct anomeric form if it is known. We refer to
sugars in a nonspecific way (e.g., glucose) when we are discussing an equilibrium
mixture of the various anomeric forms as well as the open-chain forms. When we
are discussing a specific form of a sugar, however, we will refer to it precisely (e.g.,
β-D-glucopyranose). Also, since the D enantiomers of carbohydrates predominate in
nature, we always assume that a carbohydrate has the D configuration unless specified
otherwise.

▲ Figure 8.9
α-D-glucofuranose (top) and β-D-glucofuranose

◄ Figure 8.10
Cyclization of D-ribose to form α- and β-D-ribopyranose and α- and β-D-ribofuranose.

▲ Galactose mutarotase. Mutarotases are enzymes that catalyze the interconversion of α
and β configurations. This interconversion involves the breaking and remaking of
covalent bonds, which is why they are different configurations. The enzyme shown here is
galactose mutarotase from Lactococcus lactis with a molecule of α-D-galactose
in the active site. The bottom figure shows the conformation of this molecule. Can you
identify this conformation? [PDB 1L7K]
8.3 Conformations of Monosaccharides 

Haworth projections are commonly used in biochemistry because they accurately 
depict the configuration of the atoms and groups at each carbon atom of the sugar’s 
backbone. However, the geometry of the carbon atoms of a monosaccharide ring is 
tetrahedral (bond angles near 110°), so monosaccharide rings are not actually planar. 
Cyclic monosaccharides can exist in a variety of conformations (three-dimensional 
shapes having the same configuration). Furanose rings adopt envelope conformations 
in which one of the five ring atoms (either C-2 or C-3) is out-of-plane and the remaining 
four are approximately coplanar (Figure 8.11). Furanoses can also form twist conformations 
where two of the five ring atoms are out-of-plane — one on either side of the plane 
formed by the other three atoms. The relative stability of each conformer depends on 
the degree of steric interference between the hydroxyl groups. The various conformers 
of unsubstituted monosaccharides can rapidly interconvert. 

Pyranose rings tend to assume one of two conformations, the chair conformation 
or the boat conformation (Figure 8.12). There are two distinct chair conformers and six 
distinct boat conformers for each pyranose. The chair conformations minimize steric 
repulsion among the ring substituents and are generally more stable than boat confor- 
mations. The — H, — OH, and — CH 2 OH substituents of a pyranose ring in the chair 
conformation may occupy two different positions. In the axial position the substituent 
is above or below the plane of the ring, while in the equatorial position the substituent 
lies in the plane of the ring. In pyranoses, five substituents are axial and five are equatorial. 
Whether a group is axial or equatorial depends on which carbon atom (C-1 or C-4)
extends above the plane of the ring when the ring is in the chair conformation. Figure 8.13
shows the two different chair conformers of β-D-glucopyranose. The more stable
conformation is the one in which the bulkiest ring substituents are equatorial (top
structure). In fact, this conformation of β-D-glucose has the least steric strain of any
aldohexose. Pyranose rings are occasionally forced to adopt slightly different conformations,
such as the unstable half-chair adopted by a polysaccharide residue in the active site of
lysozyme (Section 6.6).
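
The comparison of the two chair conformers can be sketched as a bookkeeping exercise. The model below makes two assumptions that go slightly beyond the text: that interconverting the two chairs swaps axial and equatorial positions at every ring carbon, and that in the more stable chair of β-D-glucopyranose the bulky group (hydroxyl or CH2OH) on each of the five ring carbons is equatorial. The names used are illustrative.

```python
# Position of the bulky substituent (-OH or -CH2OH) on each ring carbon
# in the more stable chair of beta-D-glucopyranose (assumed all equatorial).
STABLE_CHAIR = {
    "C-1": "equatorial",
    "C-2": "equatorial",
    "C-3": "equatorial",
    "C-4": "equatorial",
    "C-5": "equatorial",
}


def ring_flip(chair):
    """Convert one chair conformer into the other: every axial
    substituent becomes equatorial and vice versa."""
    swap = {"axial": "equatorial", "equatorial": "axial"}
    return {carbon: swap[position] for carbon, position in chair.items()}


def axial_bulky_groups(chair):
    """Count bulky substituents in axial positions, a crude proxy
    for steric strain."""
    return sum(1 for position in chair.values() if position == "axial")


other_chair = ring_flip(STABLE_CHAIR)
```

In this simple count the stable chair has no bulky axial groups while the flipped chair has five, which is consistent with the top conformer in Figure 8.13 being the more stable one.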


Different configurations can only be 
formed by breaking and reforming 
covalent bonds. Molecules can adopt 
different conformations without breaking 
covalent bonds. 

Figure 8.11 ►
Conformations of β-D-ribofuranose. (a) Haworth projection, (b) C2-endo envelope conformation,
(c) C3-endo envelope conformation, (d) Twist conformation. In the C2-endo conformation,
C-2 lies above the plane defined by C-1, C-3, C-4, and the ring oxygen. In the C3-endo
conformation, C-3 lies above the plane defined by C-1, C-2, C-4, and the ring oxygen.
In the twist conformation shown, C-3 lies above and C-2 lies below the plane defined
by C-1, C-4, and the ring oxygen. The planes are shown in yellow.

◄ Figure 8.12
Conformations of β-D-glucopyranose.
(a) Haworth projection, a chair conformation, and a boat conformation, (b) Ball-and-stick
model of a chair (left) and a boat (right)

8.4 Derivatives of Monosaccharides 

There are many known derivatives of the basic monosaccharides. They include poly- 
merized monosaccharides, such as oligosaccharides and polysaccharides, as well as sev- 
eral classes of nonpolymerized compounds. In this section, we introduce a few mono- 
saccharide derivatives, including sugar phosphates, deoxy and amino sugars, sugar 
alcohols, and sugar acids. 

Like other polymer-forming biomolecules, monosaccharides and their derivatives 
have abbreviations used in describing more complex polysaccharides. The accepted ab- 
breviations contain three letters, with suffixes added in some cases. The abbreviations 
for some pentoses and hexoses and their major derivatives are listed in Table 8.1. We use 
these abbreviations later in this chapter. 

A. Sugar Phosphates 

Monosaccharides are often converted to phosphate esters. Figure 8.14 shows the
structures of several of the sugar phosphates we will encounter in our study of carbohydrate
metabolism. The triose phosphates, ribose 5-phosphate, and glucose 6-phosphate are
simple alcohol-phosphate esters. Glucose 1-phosphate is a hemiacetal phosphate, which
is more reactive than an alcohol phosphate. The ability of UDP-glucose to act as a
glucosyl donor (Section 7.3) is evidence of this reactivity.


▲ Figure 8.13 

The two chair conformers of β-D-glucopyranose.

The top conformer is more stable. 

B. Deoxy Sugars 

The structures of two deoxy sugars are shown in Figure 8.15. In these derivatives, a 
hydrogen atom replaces one of the hydroxyl groups in the parent monosaccharide. 
2-Deoxy-D-ribose is an important building block for DNA. L-Fucose (6-deoxy-L-galac- 
tose) is widely distributed in plants, animals, and microorganisms. Despite its unusual 
L configuration, fucose is derived metabolically from D-mannose. 

C. Amino Sugars 

In a number of sugars, an amino group replaces one of the hydroxyl groups in the parent
monosaccharide. Sometimes the amino group is acetylated. Three examples of amino
sugars are shown in Figure 8.16. Amino sugars formed from glucose and galactose
commonly occur in glycoconjugates. N-Acetylneuraminic acid (NeuNAc) is an acid
formed from N-acetylmannosamine and pyruvate. When this compound cyclizes to
form a pyranose, the carbonyl group at C-2 (from the pyruvate moiety) reacts with the
hydroxyl group of C-6. NeuNAc is an important constituent of many glycoproteins and
of a family of lipids called gangliosides (Section 9.5). Neuraminic acid and its
derivatives, including NeuNAc, are collectively known as sialic acids.

Table 8.1 Abbreviations for some monosaccharides and their derivatives

Monosaccharide or derivative    Abbreviation
Deoxy sugars
Amino sugars
    N-Acetylgalactosamine
    N-Acetylneuraminic acid
    N-Acetylmuramic acid
Sugar acids
    Glucuronic acid
    Iduronic acid

▲ Figure 8.14
Structures of several metabolically important sugar phosphates.






▲ Figure 8.15 

Structures of the deoxy sugars 2-deoxy-D-ribose
and L-fucose.

D. Sugar Alcohols 

In a sugar alcohol, the carbonyl oxygen of the parent monosaccharide has been reduced, 
producing a polyhydroxy alcohol. Figure 8.17 shows three examples of sugar alcohols. 
Glycerol and myo-inositol are important components of lipids (Section 10.4). Ribitol is
a component of flavin mononucleotide (FMN) and flavin adenine dinucleotide (FAD) 
(Section 7.4). In general, sugar alcohols are named by replacing the suffix -ose of the 
parent monosaccharides with -itol. 
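
The general naming rule for sugar alcohols is another simple suffix replacement. The sketch below is illustrative only; it applies the rule literally and therefore does not cover exceptions such as glycerol, whose name is not formed this way. The function name is hypothetical.

```python
def sugar_alcohol_name(monosaccharide_name):
    """Replace the -ose suffix of a monosaccharide name with -itol."""
    if not monosaccharide_name.endswith("ose"):
        raise ValueError("expected a sugar name ending in -ose")
    return monosaccharide_name[:-3] + "itol"


alcohols = [sugar_alcohol_name(sugar) for sugar in ("ribose", "xylose", "mannose")]
```

Applied to ribose, the rule gives ribitol, the sugar alcohol mentioned above as a component of FMN and FAD.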

E. Sugar Acids 

Sugar acids are carboxylic acids derived from aldoses, either by oxidation of C-1 (the
aldehydic carbon) to yield an aldonic acid or by oxidation of the highest-numbered 
carbon (the carbon bearing the primary alcohol) to yield an alduronic acid. The struc- 
tures of the aldonic and alduronic derivatives of glucose — gluconate and glucuronate — 
are shown in Figure 8.18. Aldonic acids exist in the open-chain form in alkaline solution 
and form lactones (intramolecular esters) on acidification. Alduronic acids can exist as 
pyranoses and therefore possess an anomeric carbon. Note that N-acetylneuraminic 
acid (Figure 8.16) is a sugar acid as well as an amino sugar. Sugar acids are important 
components of many polysaccharides. L-Ascorbic acid, or vitamin C, is an enediol of a 
lactone derived from D-glucuronate (Section 7.9). 

8.5 Disaccharides and Other Glycosides 

The glycosidic bond is the primary structural linkage in all polymers of monosaccha- 
rides. A glycosidic bond is an acetal linkage in which the anomeric carbon of a sugar is 
condensed with an alcohol, an amine, or a thiol. As a simple example, glucopyranose 
can react with methanol in an acidic solution to form an acetal (Figure 8.19). Compounds 
containing glycosidic bonds are called glycosides; if glucose supplies the anomeric 
carbon, they are specifically termed glucosides. The glycosides include disaccharides, 
polysaccharides, and some carbohydrate derivatives. 

▲ Figure 8.16 

Structures of several amino sugars. The amino and acetylamino groups are shown in red. 
[Panel labels include N-acetyl-α-D-neuraminic acid and N-acetyl-D-neuraminic acid (open-chain form).] 

A. Structures of Disaccharides 

Disaccharides are formed when the anomeric carbon of one sugar molecule interacts 
with one of several hydroxyl groups in the other sugar molecule. For disaccharides and 
other carbohydrate polymers, we must note both the types of monosaccharide residues 
that are present and the atoms that form the glycosidic bonds. In the systematic descrip- 
tion of a disaccharide we must specify the linking atoms, the configuration of the glyco- 
sidic bond, and the name of each monosaccharide residue (including its designation as 
a pyranose or furanose). Figure 8.20 presents the structures and nomenclature for four 
common disaccharides. 
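
The three pieces of information required for a systematic name (linking atoms, configuration of the glycosidic bond, and residue names) can be assembled mechanically. The sketch below is a minimal illustration, not an official nomenclature tool; the `Disaccharide` class and its field names are hypothetical, and the residue names are supplied by hand in the form used in this section.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Disaccharide:
    """Hypothetical record of what a systematic disaccharide name must specify."""
    config: str           # configuration of the glycosidic bond: "α" or "β"
    donor: str            # residue supplying the anomeric carbon, in -osyl form
    donor_carbon: int     # carbon of the donor residue in the linkage
    acceptor_carbon: int  # carbon of the second residue in the linkage
    acceptor: str         # name of the second residue

    def systematic_name(self) -> str:
        linkage = f"({self.donor_carbon}→{self.acceptor_carbon})"
        return f"{self.config}-{self.donor}-{linkage}-{self.acceptor}"


MALTOSE = Disaccharide("α", "D-glucopyranosyl", 1, 4, "D-glucose")
CELLOBIOSE = Disaccharide("β", "D-glucopyranosyl", 1, 4, "D-glucose")
LACTOSE = Disaccharide("β", "D-galactopyranosyl", 1, 4, "D-glucose")
SUCROSE = Disaccharide("α", "D-glucopyranosyl", 1, 2, "β-D-fructofuranoside")

if __name__ == "__main__":
    for sugar in (MALTOSE, CELLOBIOSE, LACTOSE, SUCROSE):
        print(sugar.systematic_name())
```

Maltose and cellobiose differ only in the `config` field, which mirrors the fact that their sole structural difference is the configuration of the glycosidic bond.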

Maltose (Figure 8.20a) is a disaccharide released during the hydrolysis of starch, 
which is a polymer of glucose residues. It is present in malt, a mixture obtained from 
corn or grain that is used in malted milk and in brewing. Maltose is composed of two 
D-glucose residues joined by an α-glycosidic bond. The glycosidic bond links C-1 of one 
residue (on the left in Figure 8.20a) to the oxygen atom attached to C-4 of the second 
residue (on the right). Maltose is therefore α-D-glucopyranosyl-(1→4)-D-glucose. 
Note that the glucose residue on the left, whose anomeric carbon is involved in the 
glycosidic bond, is fixed in the α configuration, whereas the glucose residue on the right 
(the reducing end, as explained in Section 8.5B) freely equilibrates among the α, β, and 
open-chain structures. (The open-chain form is present in very small amounts.) The 
structure shown in Figure 8.20a is the β-pyranose anomer of maltose (the anomer 
whose reducing end is in the β configuration, the predominant anomeric form). 

Cellobiose [β-D-glucopyranosyl-(1→4)-D-glucose] is another glucose dimer 
(Figure 8.20b). Cellobiose is the repeating disaccharide in the structure of cellulose, a 
plant polysaccharide, and is released during cellulose degradation. The only difference 
between cellobiose and maltose is that the glycosidic linkage in cellobiose is β (it is α in 
maltose). The glucose residue on the right in Figure 8.20b, like the residue on the right 
in Figure 8.20a, equilibrates among the α, β, and open-chain structures. 

◄ Figure 8.17 

Structures of several sugar alcohols. Glycerol 
(a reduced form of glyceraldehyde) and myo-inositol 
(metabolically derived from glucose) 
are important constituents of many lipids. 
Ribitol (a reduced form of ribose) is a 
constituent of the vitamin riboflavin and 
its coenzymes. 

▲ Figure 8.18 

Structures of sugar acids derived from 
D-glucose. (a) Gluconate and its δ-lactone. 
(b) The open-chain and pyranose forms 
of glucuronate. 

Lactose [β-D-galactopyranosyl-(1→4)-D-glucose], a major carbohydrate in milk, 
is a disaccharide synthesized only in lactating mammary glands (Figure 8.20c). Note 
that lactose is an epimer of cellobiose. The naturally occurring α anomer of lactose is 
sweeter and more soluble than the β anomer. The β anomer can be found in stale ice cream, 
where it has crystallized during storage and given a gritty texture to the ice cream. 

Sucrose [α-D-glucopyranosyl-(1→2)-β-D-fructofuranoside], or table sugar, is the 
most abundant disaccharide found in nature (Figure 8.20d). Sucrose is synthesized only in 
plants. Sucrose is distinguished from the other three disaccharides in Figure 8.20 because 
its glycosidic bond links the anomeric carbon atoms of two monosaccharide residues. 
Therefore, the configurations of both the glucopyranose and fructofuranose residues in 
sucrose are fixed, and neither residue is free to equilibrate between α and β anomers. 

B. Reducing and Nonreducing Sugars 

Monosaccharides, and most disaccharides, are hemiacetals with a reactive carbonyl 
group. They are readily oxidized to diverse products, a property often used in their 
analysis. Such carbohydrates, including glucose, maltose, cellobiose, and lactose, are 
sometimes called reducing sugars. Historically, reducing sugars were detected by their ability 
to reduce metal ions such as Cu²⁺ or Ag⁺ to insoluble products. Carbohydrates that are 
not hemiacetals, such as sucrose, are not readily oxidized because both anomeric carbon 
atoms are fixed in a glycosidic linkage. These are classified as nonreducing sugars. 

Figure 8.19 ► 

Reaction of glucopyranose with methanol 
produces a glycoside. In this acid-catalyzed 
condensation reaction, the anomeric -OH 
group of the hemiacetal is replaced by an 
-OCH₃ group, forming methyl glucoside, 
an acetal. The product is a mixture of the 
α and β anomers of methyl glucopyranoside. 
[Reaction scheme: α-D-glucopyranose + methanol → methyl α-D-glucopyranoside and methyl β-D-glucopyranoside.] 

[Figure 8.20 panel labels: β anomer of maltose; β anomer of cellobiose; α anomer of lactose.] 

The reducing ability of a sugar polymer is of more than analytical interest. The poly- 
meric chains of oligosaccharides and polysaccharides show directionality based on their 
reducing and nonreducing ends. There is usually one reducing end (the residue contain- 
ing the free anomeric carbon) and one nonreducing end in a linear polymer. All the in- 
ternal glycosidic bonds of a polysaccharide involve acetals. The internal residues are not 
in equilibrium with open-chain forms and thus cannot reduce metal ions. A branched 
polysaccharide has a number of nonreducing ends but only one reducing end. 
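
The distinction between reducing and nonreducing disaccharides comes down to whether at least one anomeric carbon remains outside the glycosidic bond. The following sketch encodes that test. It assumes the usual anomeric positions (C-1 for the aldoses glucose and galactose, C-2 for the ketose fructose); the table `ANOMERIC_CARBON` and the function `is_reducing` are hypothetical names introduced for illustration.

```python
# Anomeric carbon of each monosaccharide (assumed: C-1 for aldoses, C-2 for fructose).
ANOMERIC_CARBON = {"glucose": 1, "galactose": 1, "fructose": 2}


def is_reducing(first: str, first_carbon: int, second: str, second_carbon: int) -> bool:
    """Return True if a disaccharide keeps at least one free anomeric carbon.

    The glycosidic bond joins `first_carbon` of the first residue to
    `second_carbon` of the second residue. A residue whose anomeric carbon is
    not in the bond is still a hemiacetal and can act as a reducing end.
    """
    first_free = first_carbon != ANOMERIC_CARBON[first]
    second_free = second_carbon != ANOMERIC_CARBON[second]
    return first_free or second_free


if __name__ == "__main__":
    examples = {
        "maltose": ("glucose", 1, "glucose", 4),
        "cellobiose": ("glucose", 1, "glucose", 4),
        "lactose": ("galactose", 1, "glucose", 4),
        "sucrose": ("glucose", 1, "fructose", 2),
    }
    for name, linkage in examples.items():
        kind = "reducing" if is_reducing(*linkage) else "nonreducing"
        print(f"{name}: {kind}")
```

Only sucrose is classified as nonreducing, because its (1→2) linkage ties up the anomeric carbons of both residues.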

C. Nucleosides and Other Glycosides 

The anomeric carbons of sugars form glycosidic linkages not only with other sugars but 
also with a variety of alcohols, amines, and thiols. The most commonly encountered gly- 
cosides, other than oligosaccharides and polysaccharides, are the nucleosides, in which a 
purine or pyrimidine is attached by its secondary amino group to a β-D-ribofuranose or 
β-D-deoxyribofuranose moiety. Nucleosides are called N-glycosides because a nitrogen 
atom participates in the glycosidic linkage. Guanosine (β-D-ribofuranosylguanine) is a 
typical nucleoside (Figure 8.21). We have already discussed ATP and other nucleotides 
that are metabolite coenzymes (Section 7.3). NAD and FAD also are nucleotides. 

Two other examples of naturally occurring glycosides are shown in Figure 8.21. 
Vanillin glucoside (Figure 8.21b) is the flavored compound in natural vanilla extract. 
β-Galactosides constitute an abundant class of glycosides. In these compounds, a variety 
of nonsugar molecules are joined in β linkage to galactose. For example, galactocerebrosides 
(see Section 9.5) are glycolipids common in eukaryotic cell membranes and can be 
hydrolyzed readily by the action of enzymes called β-galactosidases. 

▲ Figure 8.20 

Structures of (a) maltose, (b) cellobiose, 

(c) lactose, and (d) sucrose. The oxygen atom 
of each glycosidic bond is shown in red. 

▲ Sugar cane is a major source of commercial 

There is a more complete discussion 
of nucleosides and nucleotides in 
Chapter 19. 



One of the characteristics of sugars is that they taste sweet. 
You certainly know the taste of sucrose and you probably 
know that fructose and lactose also taste sweet. So do many 
of the other sugars and their derivatives, although we don’t 
recommend that you go into a biochemistry lab and start 
tasting all the carbohydrates in those white plastic bottles on 
the shelves. 

Sweetness is not a physical property of molecules. It’s a 
subjective interaction between a chemical and taste receptors 
in your mouth. There are five different kinds of taste recep- 
tors: sweet, sour, salty, bitter, and umami (umami is like the 
taste of glutamate in monosodium glutamate). In order to 
trigger the sweet taste, a molecule like sucrose has to bind to 
the receptor and initiate a response that eventually makes it 
to your brain. Sucrose elicits a moderately strong response 
that serves as the standard for sweetness. The response to 
fructose is almost twice as strong and the response to lactose 
is only about one-fifth as strong as that of sucrose. Artificial 
sweeteners such as saccharin (Sweet’N Low®), sucralose 

(Splenda®), and aspartame (NutraSweet®) bind to the sweet- 
ness receptor and cause the sensation of sweetness. They are 
hundreds of times more sweet than sucrose. 

The sweetness receptor is encoded by two genes called 
Tas1r2 and Tas1r3. We don’t know how sucrose and the other 
ligands bind to this receptor even though this is a very active 
area of research. In the case of sucrose and the artificial sweeteners, 
how can such different molecules elicit the taste of sweetness? 

Cats, including lions, tigers and cheetahs, do not have a 
functional Tas1r2 gene. It has been converted to a pseudogene 
because of a 247 bp deletion in exon 3. It’s very likely 
that your pet cat has never experienced the taste of sweetness. 
That explains a lot about cats. 

▲ Cats are carnivores. They probably can’t 
taste sweetness. 

8.6 Polysaccharides 

Polysaccharides are frequently divided into two broad classes. Homoglycans, or ho- 
mopolysaccharides, are polymers containing residues of only one type of monosaccha- 
ride. Heteroglycans, or heteropolysaccharides, are polymers containing residues of 
more than one type of monosaccharide. Polysaccharides are created without a tem- 
plate by the addition of particular monosaccharide and oligosaccharide residues. As a 
result, the lengths and compositions of polysaccharide molecules may vary within 
a population of these molecules. Some common polysaccharides and their structures 
are listed in Table 8.2. 

Most polysaccharides can also be classified according to their biological roles. For 
example, starch and glycogen are storage polysaccharides while cellulose and chitin are 
structural polysaccharides. We will see additional examples of the variety and versatility 
of carbohydrates when we discuss the heteroglycans in the next section. 

A. Starch and Glycogen 

D-Glucose is synthesized in all species. Excess glucose can be broken down to produce 
metabolic energy. Glucose residues are stored as polysaccharides until they are needed for 
energy production. The most common storage homoglycan of glucose in plants and fungi 
is starch and in animals it is glycogen. Both types of polysaccharides occur in bacteria. 


Table 8.2 Structures of some common polysaccharides 

Polysaccharide (a)            Component(s) (b)       Linkage(s) 

Storage homoglycans 
  Amylose                     Glc                    α-(1→4) 
  Amylopectin                 Glc                    α-(1→4), α-(1→6) (branches) 
  Glycogen                    Glc                    α-(1→4), α-(1→6) (branches) 

Structural homoglycans 
  Cellulose                   Glc                    β-(1→4) 

Heteroglycans (amino sugars, sugar acids) 
  Hyaluronic acid             GlcUA and GlcNAc       β-(1→3), β-(1→4) 

(a) Polysaccharides are unbranched unless otherwise indicated. 
(b) Glc, glucose; GlcNAc, N-acetylglucosamine; GlcUA, D-glucuronate. 



Starch is present in plant cells as a mixture of amylose and amylopectin and is 
stored in granules whose diameters range from 3 to 100 μm. Amylose is an unbranched 
polymer of about 100 to 1000 D-glucose residues connected by α-(1→4) glycosidic 
linkages, specifically termed α-(1→4) glucosidic bonds because the anomeric carbons 
belong to glucose residues (Figure 8.22a). The same type of linkage connects glucose 
monomers in the disaccharide maltose (Figure 8.20a). Although it is not truly soluble in 
water, amylose forms hydrated micelles in water and can assume a helical structure 
under some conditions (Figure 8.22b). 
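
To get a feel for the size of an amylose chain, the residue counts quoted above can be converted to approximate molar masses. The sketch below assumes each glucose residue in the polymer contributes C6H10O5 (about 162.14 g/mol) and that one water (about 18.02 g/mol) is added back for the two chain ends; the constant and function names are hypothetical.

```python
GLUCOSE_RESIDUE_G_PER_MOL = 162.14  # C6H10O5: glucose minus one water per glycosidic bond
WATER_G_PER_MOL = 18.02             # added once for the free ends of a linear chain


def linear_glucan_molar_mass(n_residues: int) -> float:
    """Approximate molar mass (g/mol) of an unbranched chain of glucose residues."""
    if n_residues < 1:
        raise ValueError("a chain needs at least one residue")
    return n_residues * GLUCOSE_RESIDUE_G_PER_MOL + WATER_G_PER_MOL


if __name__ == "__main__":
    for n in (100, 1000):
        print(f"{n} residues: about {linear_glucan_molar_mass(n):,.0f} g/mol")
```

Under these assumptions, amylose chains of 100 to 1000 residues correspond to roughly 16,000 to 162,000 g/mol.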

Amylopectin is a branched version of amylose (Figure 8.23). Branches, or polymeric 
side chains, are attached via α-(1→6) glucosidic bonds to linear chains of 
residues linked by α-(1→4) glucosidic bonds. Branching occurs, on average, once 
every 25 residues and the side chains contain about 15 to 25 glucose residues. Some side 
chains themselves are branched. Amylopectin molecules isolated from living cells may 
contain 300 to 6000 glucose residues. 

An adult human consumes about 300 g of carbohydrate daily, much of which is in 
the form of starch. Raw starch granules resist enzymatic hydrolysis but cooking causes 
them to absorb water and swell. The swollen starch is a substrate for two different 
glycosidases. Dietary starch is degraded in the gastrointestinal tract by the actions of 
α-amylase and a debranching enzyme. α-Amylase, which is present in both animals and 
plants, is an endoglycosidase (it acts on internal glycosidic bonds). The enzyme catalyzes 
random hydrolysis of the α-(1→4) glucosidic bonds of amylose and amylopectin. 

▲ Figure 8.21 

Structures of three glycosides. The nonsugar 
components are shown in blue. (a) Guanosine. 
(b) Vanillin glucoside, the flavored compound 
in vanilla extract. (c) β-D-Galactosyl 
1-glycerol, derivatives of which are common 
in eukaryotic cell membranes. 

Starch metabolism is described in 
Chapter 15. 

▲ Figure 8.22 

Amylose. (a) Structure of amylose. Amylose, one form of starch, is a linear polymer of glucose 
residues linked by α-(1→4)-D-glucosidic bonds. (b) Amylose can assume a left-handed helical 
conformation, which is hydrated on the inside as well as on the outer surface. 

Figure 8.23 ► 

Structure of amylopectin. Amylopectin, a 
second form of starch, is a branched polymer. 
The linear glucose residues of the main 
chain and the side chains of amylopectin 
are linked by α-(1→4)-D-glucosidic bonds, 
and the side chains are linked to the main 
chain by α-(1→6)-D-glucosidic bonds. 

Another hydrolase, β-amylase, is found in the seeds and tubers of some plants. 
β-Amylase is an exoglycosidase (it acts on terminal glycosidic bonds). It catalyzes 
sequential hydrolytic release of maltose from the free, nonreducing ends of amylopectin. 

Despite their α and β designations, both types of amylases act only on 
α-(1→4)-D-glycosidic bonds. Figure 8.24 shows the action of α-amylase and β-amylase on 
amylopectin. The α-(1→6) linkages at branch points are not substrates for either α- or 
β-amylase. After amylase-catalyzed hydrolysis of amylopectin, highly branched cores 
resistant to further hydrolysis, called limit dextrins, remain. Limit dextrins can be further 
degraded only after debranching enzymes have catalyzed hydrolysis of the α-(1→6) 
linkages at branch points. 
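
The stepwise action of β-amylase on one outer chain of amylopectin can be mimicked with a simple loop. This is an idealized sketch, not a kinetic model: it assumes the enzyme removes one maltose (two glucose residues) at a time from the nonreducing end and stops when it would cut into a stub of `protected` residues next to the α-(1→6) branch point. The default of two protected residues is an arbitrary assumption chosen for illustration, and the function name is hypothetical.

```python
from typing import Tuple


def beta_amylase_on_outer_chain(chain_length: int, protected: int = 2) -> Tuple[int, int]:
    """Idealized β-amylase digestion of a single outer chain.

    chain_length: glucose residues between the branch point and the nonreducing end.
    protected:    residues next to the branch point assumed to resist the enzyme
                  (illustrative assumption, not a measured value).
    Returns (maltose units released, residues left on the limit dextrin).
    """
    if chain_length < 0 or protected < 0:
        raise ValueError("lengths must be non-negative")
    remaining = chain_length
    maltose_released = 0
    # Remove maltose from the nonreducing end while the protected stub stays intact.
    while remaining - 2 >= protected:
        remaining -= 2
        maltose_released += 1
    return maltose_released, remaining


if __name__ == "__main__":
    for length in (20, 21):
        released, stub = beta_amylase_on_outer_chain(length)
        print(f"outer chain of {length}: {released} maltose released, {stub} residues remain")
```

The residues left behind on every branch, together with the intact α-(1→6) linkages, correspond to the limit dextrin that persists until a debranching enzyme acts.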

Glycogen is also a branched polymer of glucose residues. Glycogen contains the 
same types of linkages found in amylopectin but the branches in glycogen are smaller 
and more frequent, occurring every 8-12 residues. In general, glycogen molecules are 
larger than starch molecules. Glycogen contains up to 50,000 glucose residues. In mammals, 
depending on the nutritional state, glycogen can account for up to 10% of the mass of 
the liver and 2% of the mass of muscle. 

Figure 8.24 ► 

Action of α-amylase and β-amylase on 
amylopectin. α-Amylase catalyzes random 
hydrolysis of internal α-(1→4) glucosidic 
bonds; β-amylase acts on the nonreducing 
ends. Each hexagon represents a glucose 
residue; the single reducing end of the 
branched polymer is red. (An actual 
amylopectin molecule contains many more 
glucose residues than shown here.) 

The branched structures of amylopectin and glycogen possess only one reducing 
end but many nonreducing ends. The reducing end of glycogen is covalently attached to 
a protein called glycogenin (Section 12.5A). Enzymatic lengthening and degradation of 
polysaccharide chains occurs at the nonreducing ends. 
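
The statement that a branched polymer has one reducing end but many nonreducing ends can be made quantitative. Each α-(1→6) branch point starts a new chain with its own nonreducing end, so a molecule with b branch points has b + 1 nonreducing ends. The sketch below applies this to the sizes and branching frequencies quoted in this section; it assumes perfectly uniform branching, which real molecules do not show, and the function name is hypothetical.

```python
def estimate_ends(n_residues: int, residues_per_branch: int) -> dict:
    """Estimate chain ends of a branched glucan, assuming uniform branching.

    n_residues:          total glucose residues in the molecule.
    residues_per_branch: average number of residues per branch point.
    """
    if n_residues < 1 or residues_per_branch < 1:
        raise ValueError("arguments must be positive")
    branch_points = n_residues // residues_per_branch
    return {
        "branch_points": branch_points,
        "reducing_ends": 1,                     # always a single reducing end
        "nonreducing_ends": branch_points + 1,  # one per branch, plus the main chain
    }


if __name__ == "__main__":
    # Amylopectin: up to 6000 residues, one branch about every 25 residues.
    print("amylopectin:", estimate_ends(6000, 25))
    # Glycogen: up to 50,000 residues, one branch every 8 to 12 residues.
    print("glycogen (sparse):", estimate_ends(50_000, 12))
    print("glycogen (dense): ", estimate_ends(50_000, 8))
```

On these assumptions a large glycogen molecule offers thousands of nonreducing ends, many more than a large amylopectin molecule, which is where chain lengthening and degradation take place.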

Enzymes that catalyze the intracellular 
synthesis and breakdown of glycogen 
are described in Chapter 12. 

B. Cellulose 

Cellulose is a structural polysaccharide. It is a major component of the rigid cell walls 
that surround many plant cells. The stems and branches of many plants consist largely 
of cellulose. This single polysaccharide accounts for a significant percentage of all or- 
ganic matter on Earth. Like amylose, cellulose is a linear polymer of glucose residues, 
but in cellulose the glucose residues are joined by β-(1→4) linkages rather than 
α-(1→4) linkages. The two glucose residues of the disaccharide cellobiose also are 
connected by a β-(